Overview:
- The measures of central tendency (mean, median and mode) describe where most of the data is located and what its typical values are. Similarly, the measures of dispersion (variance and standard deviation) describe how much the data varies around its mean.
- The methods mean() and median(), invoked on a rolling object obtained from a pandas DataFrame, calculate the mean and the median of each window. The methods var() and std() calculate the variance and the standard deviation of the data in each rolling window.
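As a minimal sketch of the four calls described above, the snippet below builds a rolling object over a small DataFrame and applies each method to it. The column name `x`, its values and the window size of two are hypothetical and chosen only for illustration; the sketch also assumes the pandas defaults, under which var() and std() return the sample statistics (ddof=1).

```python
import pandas as pd

# Hypothetical sample values, used only to illustrate the calls
df = pd.DataFrame({"x": [1.0, 2.0, 4.0, 7.0]})

# Rolling object over windows of two consecutive rows
windows = df.rolling(window=2)

means = windows.mean()      # mean of each window
medians = windows.median()  # median of each window
variances = windows.var()   # sample variance of each window (ddof=1 by default)
stds = windows.std()        # sample standard deviation (ddof=1 by default)

print(means)
print(medians)
print(variances)
print(stds)
```

The first row of every result is NaN because a window of two rows is not complete until the second row.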
Example:
- Assuming price1 and price2 are the trade prices of a stock taken from two different execution venues, the frames returned by the rolling window calculations give the mean price, the median price, the variance of the prices and their standard deviation for each window. With a window of three rows, the first two rows of each result are NaN because no complete window is available yet.
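# Example Python program that calculates statistics over rolling windows
import pandas as pd

data = [("2022-06-23 17:37:24", 24.5, 23.1),
        ("2022-06-23 17:37:25", 24.1, 23.5),
        ("2022-06-23 17:37:26", 25.2, 24.3),
        ("2022-06-23 17:37:27", 25.3, 23.2)]

# Create a DataFrame
df = pd.DataFrame(data=data, columns=["time", "price1", "price2"])
print("DataFrame:", df, sep="\n")

# Get rolling windows of three rows
rolling = df.set_index("time").rolling(window=3)
print(rolling)

# Calculate mean on the rolling window of prices
print(rolling.mean().add_prefix("mean "))
# Calculate the median
print(rolling.median().add_prefix("median "))
# Calculate the variance
print(rolling.var().add_suffix(" variance"))
# Calculate the standard deviation
print(rolling.std().add_suffix(" std"))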
Output:
DataFrame:
                  time  price1  price2
0  2022-06-23 17:37:24    24.5    23.1
1  2022-06-23 17:37:25    24.1    23.5
2  2022-06-23 17:37:26    25.2    24.3
3  2022-06-23 17:37:27    25.3    23.2
Rolling [window=3,center=False,axis=0,method=single]
                     mean price1  mean price2
time
2022-06-23 17:37:24          NaN          NaN
2022-06-23 17:37:25          NaN          NaN
2022-06-23 17:37:26    24.600000    23.633333
2022-06-23 17:37:27    24.866667    23.666667
                     median price1  median price2
time
2022-06-23 17:37:24            NaN            NaN
2022-06-23 17:37:25            NaN            NaN
2022-06-23 17:37:26           24.5           23.5
2022-06-23 17:37:27           25.2           23.5
                     price1 variance  price2 variance
time
2022-06-23 17:37:24              NaN              NaN
2022-06-23 17:37:25              NaN              NaN
2022-06-23 17:37:26         0.310000         0.373333
2022-06-23 17:37:27         0.443333         0.323333
                     price1 std  price2 std
time
2022-06-23 17:37:24         NaN         NaN
2022-06-23 17:37:25         NaN         NaN
2022-06-23 17:37:26    0.556776    0.611010
2022-06-23 17:37:27    0.665833    0.568624
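The figures in the output can be checked by hand. The sketch below recomputes the first complete window of price1 (the rows at 17:37:24, 17:37:25 and 17:37:26) with Python's standard statistics module. It assumes that the rolling variance and standard deviation are the sample versions, dividing by n - 1, which is consistent with the printed values.

```python
import statistics

# First complete window of price1, taken from the DataFrame in the output
window = [24.5, 24.1, 25.2]

mean = statistics.mean(window)
median = statistics.median(window)
variance = statistics.variance(window)  # sample variance, divides by n - 1
std = statistics.stdev(window)          # square root of the sample variance

# Rounded to six decimals to match the precision of the printed frames
print(round(mean, 6), median, round(variance, 6), round(std, 6))
```

The rounded results agree with the 17:37:26 row of each frame: 24.6, 24.5, 0.31 and 0.556776.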