Overview:
- Kurtosis is one of the two measures that quantify the shape of a distribution. The other measure is skewness.
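- Kurtosis is commonly described as the peakedness of a distribution; more precisely, it reflects how much of the data lies in the tails compared with a normal distribution.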
- A tall, thin distribution is called a leptokurtic distribution. Values in a leptokurtic distribution tend to lie either near the mean or at the extremes.
- A flat distribution whose values are moderately spread out (i.e., unlike a leptokurtic distribution) is called a platykurtic distribution.
- A distribution whose shape lies between a leptokurtic distribution and a platykurtic distribution is called a mesokurtic distribution. A mesokurtic distribution closely resembles a normal distribution.
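The three shapes can be made concrete with a small sketch. It draws samples from a Laplace, a normal, and a uniform distribution and reports the kurtosis of each with pandas' kurt(), which measures kurtosis relative to a normal distribution. The choice of distributions, the seed, and the sample size are illustrative assumptions, not part of the original text.

```python
import numpy as np
import pandas as pd

# Assumed for illustration: a fixed seed and an arbitrary, large sample size.
rng = np.random.default_rng(0)
n = 100_000

samples = pd.DataFrame({
    "laplace": rng.laplace(size=n),   # sharp peak, heavy tails: leptokurtic
    "normal": rng.normal(size=n),     # reference shape: mesokurtic
    "uniform": rng.uniform(size=n),   # flat, no tails: platykurtic
})

# One kurtosis value per column (axis=0 is the default).
shape_kurtosis = samples.kurt()
print(shape_kurtosis)
```

With a sample this large, the Laplace column should come out clearly positive, the uniform column clearly negative, and the normal column close to zero.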
Kurtosis function in pandas:
- The pandas DataFrame class has a method kurtosis(), also available under the shorter name kurt(), that computes the kurtosis of a set of values along a specific axis (i.e., for each column or for each row).
- The pandas kurtosis() method computes Fisher's kurtosis, which is obtained by subtracting three from Pearson's kurtosis. Under Fisher's definition, a normal distribution has a kurtosis of 0.
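The relationship between the two definitions can be checked by hand. The sketch below computes Pearson's kurtosis from the second and fourth central moments, subtracts three to get Fisher's kurtosis, and compares the result with pandas. One assumption worth flagging: as far as I can tell, pandas additionally applies a small-sample bias correction, so the plain "Pearson minus three" value only matches kurt() after that correction is applied. The variable names are hypothetical.

```python
import pandas as pd

values = pd.Series([65, 75, 74, 73, 95, 76, 62, 100])
n = len(values)

# Central moments using the population (divide-by-n) convention.
dev = values - values.mean()
m2 = (dev ** 2).mean()
m4 = (dev ** 4).mean()

pearson = m4 / m2 ** 2   # Pearson's kurtosis: 3 for a normal distribution
fisher = pearson - 3     # Fisher's (excess) kurtosis: 0 for a normal distribution

# Assumed small-sample correction, which is what makes the value agree with pandas.
corrected = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * fisher + 6)

print("Pearson:", pearson)
print("Fisher (uncorrected):", fisher)
print("Fisher (bias-corrected):", corrected)
print("pandas kurt():", values.kurt())
```

For small samples the uncorrected and corrected values can differ noticeably; the gap shrinks as the number of observations grows.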
Example:
    import pandas as pd
    import numpy as np

    # First data set: two rows of eight integer values each
    dataMatrix = [(65, 75, 74, 73, 95, 76, 62, 100),
                  (101, 102, 103, 107, 157, 160, 191, 192)]
    dataFrame = pd.DataFrame(data=dataMatrix)

    # axis=1 computes one kurtosis value per row
    kurt = dataFrame.kurt(axis=1)
    print("Data:")
    print(dataFrame)
    print("Kurtosis:")
    print(kurt)

    # Second data set: the second row contains negative values and a missing value (None)
    dataMatrix = [(70, 90, 90, 100, 120, 120, 100, 121, 125, 115, 112),
                  (58.22, 39.33, -30.44, 36.77, 20.80, -73.95, -39.99, 91.03, -138.01, -20, None)]
    dataFrame = pd.DataFrame(data=dataMatrix)

    kurt = dataFrame.kurt(axis=1)
    print("Data:")
    print(dataFrame)
    print("Kurtosis:")
    print(kurt)
Output:
    Data:
         0    1    2    3    4    5    6    7
    0   65   75   74   73   95   76   62  100
    1  101  102  103  107  157  160  191  192
    Kurtosis:
    0   -0.246357
    1   -2.044655
    dtype: float64
    Data:
           0      1      2       3      4       5       6       7        8    9     10
    0  70.00  90.00  90.00  100.00  120.0  120.00  100.00  121.00   125.00  115  112.0
    1  58.22  39.33 -30.44   36.77   20.8  -73.95  -39.99   91.03  -138.01  -20    NaN
    Kurtosis:
    0    0.057451
    1    0.067184
    dtype: float64
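The second data set contains a missing value, yet a kurtosis is still reported for that row. A minimal sketch of why, assuming the default skipna=True behaviour of kurt(): missing values are left out of the calculation, whereas skipna=False lets them propagate into the result. The sketch writes the missing value as np.nan rather than None; the names skipped, strict, and manual are hypothetical.

```python
import numpy as np
import pandas as pd

dataMatrix = [(70, 90, 90, 100, 120, 120, 100, 121, 125, 115, 112),
              (58.22, 39.33, -30.44, 36.77, 20.80, -73.95, -39.99, 91.03, -138.01, -20, np.nan)]
dataFrame = pd.DataFrame(data=dataMatrix)

# Default behaviour: NaN entries are ignored, row by row.
skipped = dataFrame.kurt(axis=1)

# skipna=False: any row that contains a NaN yields NaN.
strict = dataFrame.kurt(axis=1, skipna=False)

# Dropping the NaN by hand should give the same value as the default.
manual = dataFrame.loc[1].dropna().kurt()

print(skipped)
print(strict)
print(manual)
```

If missing values should invalidate a row rather than be silently ignored, passing skipna=False makes that explicit.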