Compute the correlation coefficient for the variables represented by two pandas.Series objects

Overview:

Most data analysis done with the Python library pandas involves the data structures Series and DataFrame. A pandas.Series is a one-dimensional labeled array, and a pandas.DataFrame is a two-dimensional labeled table whose columns can hold different data types. Both are mutable, and both are built on NumPy's ndarray as the underlying data structure.
The classes pandas.Series and pandas.DataFrame provide methods for holding, re-shaping the data and performing statistical and mathematical operations on the data.
The method Series.corr() computes the correlation between two variables represented by two pandas.Series instances. The DataFrame.corr() method computes the pairwise correlation coefficients between the columns of a pandas.DataFrame.
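As a minimal sketch of the DataFrame side, the snippet below puts the house price and year values used in the example of this article into a DataFrame and calls DataFrame.corr(). The column names "year" and "house_price" are chosen here for illustration only.

```python
import pandas as pd

# Column names are illustrative; the values are the ones used in this article's example
df = pd.DataFrame({
    "year": [2014, 2015, 2016, 2017, 2018, 2019],
    "house_price": [250, 265, 270, 262, 268, 272],
})

# Pairwise correlation between all columns (Pearson is the default method)
corr_matrix = df.corr()
print(corr_matrix)

# The coefficient for a single pair of columns is read off the matrix
price_vs_year = corr_matrix.loc["house_price", "year"]
print(round(price_vs_year, 2))
```

The result is a square matrix with one row and one column per DataFrame column. The diagonal holds the correlation of each column with itself, and the off-diagonal entry is the same value that Series.corr() would return for the two columns.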
Correlation is a statistical measure of how strongly two variables are related, if a relationship exists between them at all. Examples include per capita income and life expectancy, or forest coverage and the annual rainfall of a region. Correlation is measured by the correlation coefficient (r).
The value of the correlation coefficient is always in the range of -1 to +1.
When the correlation coefficient is +1, the two variables are perfectly correlated in the positive direction: as one variable increases, the other increases in exact proportion. The size of the step does not matter. If one variable increases by 1 whenever the other increases by 1, the coefficient is +1, and it is still +1 if the second variable increases by only 0.5 each time, as long as the relationship holds exactly. When the correlation coefficient is -1, the two variables are perfectly negatively correlated: as one variable increases, the other decreases in exact proportion. A coefficient of 0 indicates no linear relationship, and values in between indicate a weaker positive or negative relationship.
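A small sketch may make the sign and size of the coefficient more concrete. The series below are made-up values chosen for illustration, not real data, and the default Pearson method is used throughout.

```python
import pandas as pd

x = pd.Series([1, 2, 3, 4, 5])

# Made-up series for illustration
same_step = x + 10                      # rises by 1 for each unit of x
half_step = x * 0.5                     # rises by 0.5 for each unit of x
falling = 20 - 2 * x                    # falls by 2 for each unit of x
scattered = pd.Series([2, 1, 4, 3, 5])  # rises overall, but not along a straight line

r_same = x.corr(same_step)
r_half = x.corr(half_step)
r_falling = x.corr(falling)
r_scattered = x.corr(scattered)

print("same step:", round(r_same, 2))        # close to +1
print("half step:", round(r_half, 2))        # close to +1
print("falling:  ", round(r_falling, 2))     # close to -1
print("scattered:", round(r_scattered, 2))   # between 0 and +1
```

The first two series give the same coefficient even though their step sizes differ, because the coefficient measures how consistently the variables move together, not by how much.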
There are several methods to measure the correlation coefficient. The pandas method Series.corr() supports the Pearson, Kendall and Spearman methods, selected by passing "pearson", "kendall" or "spearman" to the method parameter. It also supports a custom method: the method parameter accepts a callable as well. The custom function calculating the correlation coefficient should take two one-dimensional ndarray objects as parameters and should return a float.
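The callable option could be sketched as follows. The function pearson_by_hand is a hypothetical name introduced here, not part of pandas; it recomputes Pearson's r with NumPy so that the result can be checked against the built-in "pearson" method.

```python
import numpy as np
import pandas as pd


def pearson_by_hand(a: np.ndarray, b: np.ndarray) -> float:
    """Hypothetical custom correlation function: Pearson's r from its definition."""
    a_dev = a - a.mean()  # deviations from the mean of a
    b_dev = b - b.mean()  # deviations from the mean of b
    numerator = (a_dev * b_dev).sum()
    denominator = np.sqrt((a_dev ** 2).sum() * (b_dev ** 2).sum())
    return float(numerator / denominator)


house_prices = pd.Series([250, 265, 270, 262, 268, 272])
years = pd.Series([2014, 2015, 2016, 2017, 2018, 2019])

# Pass the function itself (not a string) through the method parameter
custom = house_prices.corr(years, method=pearson_by_hand)
builtin = house_prices.corr(years, method="pearson")

print(round(custom, 2), round(builtin, 2))
```

Any function with the same shape (two one-dimensional arrays in, one float out) can be swapped in, which is useful when a measure other than the three built-in ones is needed.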

Example:

# Python example to find the correlation coefficient
# of two variables represented by two pandas Series instances
import pandas as pd

# House prices
house_price_list = [250, 265, 270, 262, 268, 272]

# The years
year_list = [2014, 2015, 2016, 2017, 2018, 2019]

# House prices loaded into a pandas Series
house_prices = pd.Series(house_price_list)

# Years loaded into a pandas Series
years = pd.Series(year_list)

# Find the correlation coefficient between house price and year
corr_value = house_prices.corr(years, method="pearson")
print("Correlation coefficient between house price and year (Method:Pearson)")
print(round(corr_value, 2))

# Kendall and Spearman are rank-based methods;
# pandas may require SciPy to be installed for these
corr_value = house_prices.corr(years, method="kendall")
print("Correlation coefficient between house price and year (Method:Kendall rank correlation coefficient)")
print(round(corr_value, 2))

corr_value = house_prices.corr(years, method="spearman")
print("Correlation coefficient between house price and year (Method:Spearman rank correlation coefficient)")
print(round(corr_value, 2))

Output:

Correlation coefficient between house price and year (Method:Pearson)
0.75
Correlation coefficient between house price and year (Method:Kendall rank correlation coefficient)
0.6
Correlation coefficient between house price and year (Method:Spearman rank correlation coefficient)
0.71