# Calculating Central Tendency And Dispersion Values On The Rolling Windows Of A Pandas DataFrame

## Overview:

• The measures of central tendency (mean, median and mode) describe where most of the data is located and what its typical values are. Similarly, the measures of dispersion (variance and standard deviation) describe how much the data varies around its mean.
• The methods mean() and median(), invoked on a rolling object obtained from a pandas DataFrame, calculate the mean and the median of each window. The methods var() and std() calculate the variance and the standard deviation of the data in each rolling window.
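
The overview lists the mode alongside the mean and the median, but the rolling object does not appear to offer a mode() method of its own. A minimal sketch of one way to get a rolling mode, assuming the generic apply() method is acceptable for the data size, is shown here. The sample values and the helper name window_mode are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data, chosen so that each window has a clear mode
prices = pd.Series([1.0, 2.0, 2.0, 3.0, 3.0, 3.0])

def window_mode(window: pd.Series) -> float:
    # Series.mode() returns every mode in ascending order;
    # this sketch keeps the smallest one when there are ties
    return window.mode().iloc[0]

# raw=False passes each window to the function as a pandas Series
rolling_mode = prices.rolling(3).apply(window_mode, raw=False)
print(rolling_mode)
```

Because apply() calls a Python function once per window, it is likely to be noticeably slower than the built-in mean() and median() on large frames.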

## Example:

• Assuming price1 and price2 are the trade prices of a stock taken from two different execution venues, the frames returned by the rolling window calculations give the mean price, the median price, the variance and the standard deviation of the price at each venue over each window.

    # Example Python program that
    # uses pandas rolling window functions
    # to calculate central tendency measures and dispersion
    # measures
    import pandas as pd

    # Create a DataFrame
    data = [(pd.Timestamp(1656005844, unit='s'), 24.5, 23.1),
            (pd.Timestamp(1656005845, unit='s'), 24.1, 23.5),
            (pd.Timestamp(1656005846, unit='s'), 25.2, 24.3),
            (pd.Timestamp(1656005847, unit='s'), 25.3, 23.2)]

    df = pd.DataFrame(data=data, columns=["time", "price1", "price2"])
    print("DataFrame:")
    print(df)

    # Get rolling windows of three rows
    numRows = 3
    r = df.rolling(numRows)
    print(r)

    # Calculate the mean on the rolling windows of prices
    # (a list of column names selects both price columns)
    r1 = r[["price1", "price2"]].mean()
    r1.columns = ["mean price1", "mean price2"]
    r1.index = df["time"]
    print(r1)

    # Calculate the median
    r2 = r[["price1", "price2"]].median()
    r2.columns = ["median price1", "median price2"]
    r2.index = df["time"]
    print(r2)

    # Calculate the variance
    r3 = r[["price1", "price2"]].var()
    r3.columns = ["price1 variance", "price2 variance"]
    r3.index = df["time"]
    print(r3)

    # Calculate the standard deviation
    r4 = r[["price1", "price2"]].std()
    r4.columns = ["price1 std", "price2 std"]
    r4.index = df["time"]
    print(r4)

## Output:

    DataFrame:
                     time  price1  price2
    0 2022-06-23 17:37:24    24.5    23.1
    1 2022-06-23 17:37:25    24.1    23.5
    2 2022-06-23 17:37:26    25.2    24.3
    3 2022-06-23 17:37:27    25.3    23.2
    Rolling [window=3,center=False,axis=0,method=single]
                         mean price1  mean price2
    time
    2022-06-23 17:37:24          NaN          NaN
    2022-06-23 17:37:25          NaN          NaN
    2022-06-23 17:37:26    24.600000    23.633333
    2022-06-23 17:37:27    24.866667    23.666667
                         median price1  median price2
    time
    2022-06-23 17:37:24            NaN            NaN
    2022-06-23 17:37:25            NaN            NaN
    2022-06-23 17:37:26           24.5           23.5
    2022-06-23 17:37:27           25.2           23.5
                         price1 variance  price2 variance
    time
    2022-06-23 17:37:24              NaN              NaN
    2022-06-23 17:37:25              NaN              NaN
    2022-06-23 17:37:26         0.310000         0.373333
    2022-06-23 17:37:27         0.443333         0.323333
                         price1 std  price2 std
    time
    2022-06-23 17:37:24         NaN         NaN
    2022-06-23 17:37:25         NaN         NaN
    2022-06-23 17:37:26    0.556776    0.611010
    2022-06-23 17:37:27    0.665833    0.568624
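
The first two rows of each result are NaN because a window given as an integer needs that many observations by default, and fewer than three are available there. The variance and standard deviation values are sample statistics (ddof=1), which is the default for var() and std(). The sketch below reuses the price1 values from the example to show how the min_periods and ddof parameters change those two behaviours. The variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# price1 values taken from the example above
price1 = pd.Series([24.5, 24.1, 25.2, 25.3])

# min_periods=1 lets a window produce a value as soon as
# it holds at least one observation, so the leading NaNs disappear
relaxed_mean = price1.rolling(3, min_periods=1).mean()
print(relaxed_mean)

# Default var() is the sample variance (divides by n - 1)
sample_var = price1.rolling(3).var()

# ddof=0 gives the population variance (divides by n),
# which matches NumPy's default for np.var
pop_var = price1.rolling(3).var(ddof=0)
print(sample_var)
print(pop_var)

# Cross-check the first complete window against NumPy
first_window = [24.5, 24.1, 25.2]
print(np.var(first_window, ddof=1), np.var(first_window))
```

Whether the sample or the population form is appropriate depends on whether each window is treated as a sample of a larger price process or as the complete data of interest.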