Overview:
- Measures of central tendency identify where the data in a distribution is centered. The mean, the median, and the mode are the commonly used measures of central tendency.
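- The mean is the sum of the values divided by their count. The mean() and median() methods of a pandas DataFrame return the mean and median of the values along a given axis: axis=0 (the default) gives one result per column, and axis=1 gives one result per row.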
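- The median is the middle value of the sorted dataset; it divides the data into an upper half and a lower half. When the number of values is even, the median is the average of the two middle values.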
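- The mode is the most frequently occurring value in a dataset or distribution. A dataset can have more than one mode, which is why DataFrame.mode() returns a DataFrame with one row per mode rather than a single value per column or row.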
- Like the measures of central tendency, a quantile is also a measure of location. In a dataset, it identifies the value at or below which a given proportion of the data lies.
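The quantile idea in the last point can be sketched with the DataFrame.quantile() method. This is a minimal illustration rather than part of the original example: the variable names df, q50 and quartiles are chosen here for illustration, and the data is the same sample data used in the example on this page. It relies on pandas' default linear interpolation.

```python
import pandas as pd

# Same sample data as the example on this page
df = pd.DataFrame({"D1": [135, 137, 136, 138, 138],
                   "D2": [43, 42, 42, 42, 42],
                   "D3": [72, 73, 72, 72, 73],
                   "D4": [100, 102, 100, 103, 104]})

# The 0.5 quantile of each column is that column's median
q50 = df.quantile(0.5)
print("0.5 quantile, computed column-wise:")
print(q50)

# Passing a list returns a DataFrame with one row per requested quantile
quartiles = df.quantile([0.25, 0.5, 0.75])
print("Quartiles, computed column-wise:")
print(quartiles)
```

Because the 0.5 quantile is the median, q50 should hold the same values as df.median() for this data.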
Example:
import pandas as pd
dataMatrix = {"D1": [135, 137, 136, 138, 138], "D2": [43, 42, 42, 42, 42], "D3": [72, 73, 72, 72, 73], "D4": [100, 102, 100, 103, 104]}
dataFrame = pd.DataFrame(data=dataMatrix)
print("DataFrame:")
print(dataFrame)
# axis=0 (the default) computes one result per column; axis=1 computes one result per row
print("Mean:Computed column-wise:")
print(dataFrame.mean())
print("Mean:Computed row-wise:")
print(dataFrame.mean(axis=1))
print("Median:Computed column-wise:")
print(dataFrame.median())
print("Median:Computed row-wise:")
print(dataFrame.median(axis=1))
# mode() returns a DataFrame because a column or row can have several modes
print("Mode:Computed column-wise:")
print(dataFrame.mode())
print("Mode:Computed row-wise:")
print(dataFrame.mode(axis=1))
Output:
DataFrame:
    D1  D2  D3   D4
0  135  43  72  100
1  137  42  73  102
2  136  42  72  100
3  138  42  72  103
4  138  42  73  104
Mean:Computed column-wise:
D1    136.8
D2     42.2
D3     72.4
D4    101.8
dtype: float64
Mean:Computed row-wise:
0    87.50
1    88.50
2    87.50
3    88.75
4    89.25
dtype: float64
Median:Computed column-wise:
D1    137.0
D2     42.0
D3     72.0
D4    102.0
dtype: float64
Median:Computed row-wise:
0    86.0
1    87.5
2    86.0
3    87.5
4    88.5
dtype: float64
Mode:Computed column-wise:
    D1  D2  D3   D4
0  138  42  72  100
Mode:Computed row-wise:
    0   1    2    3
0  43  72  100  135
1  42  73  102  137
2  42  72  100  136
3  42  72  103  138
4  42  73  104  138
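In the row-wise mode output, every row lists all four of its values because each value occurs exactly once, so all of them tie as modes. The sketch below shows the column-wise counterpart of that situation. The data and the names df and modes are hypothetical, made up only to show a column with two modes next to a column with one.

```python
import pandas as pd

# Hypothetical data: column "A" has two modes (1 and 2), column "B" has one (5)
df = pd.DataFrame({"A": [1, 1, 2, 2, 3],
                   "B": [5, 5, 5, 6, 7]})

# The result has one row per mode; columns with fewer modes are padded with NaN
modes = df.mode()
print("Mode:Computed column-wise:")
print(modes)
```

The result has two rows because column "A" has two modes. Column "B" has only one, so its second row is filled with NaN.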