Overview:
- The combine() method of the pandas DataFrame class combines two DataFrame instances into one.
- It takes a function that receives two Series, one column from each DataFrame, and returns a Series or scalar. The function is called column by column and decides which values are placed in the new DataFrame. It can also be a true element-wise function such as numpy.maximum().
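To make the column-wise behaviour concrete, the following minimal sketch records what combine() passes to the function. The names take_first, calls, left and right are hypothetical and chosen only for illustration; the function simply keeps the column from the first DataFrame.

```python
import pandas as pd

# Hypothetical log of every call that combine() makes to our function
calls = []

def take_first(s1, s2):
    # s1 and s2 are the same-named columns of the two DataFrames
    calls.append((s1.name, type(s1).__name__, type(s2).__name__))
    return s1  # always keep the column from the first DataFrame

left = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
right = pd.DataFrame({"a": [10, 20], "b": [30, 40]})

result = left.combine(right, take_first)

# One entry per column: the column name and the types of both arguments
print(calls)
print(result)
```

Each recorded entry should correspond to one column, with both arguments being Series rather than individual values.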
Example:
import pandas as pd

# Compare the standard deviation of two columns and select
# the column with the smaller standard deviation.
# combine() passes one column (a Series) from each DataFrame.
def selectVal(col1, col2):
    if col1.std() < col2.std():
        return col1
    return col2

# Data for DataFrame objects
data1 = [(0, 1, 1), (2, 3, 5), (8, 13, 21)]
data2 = [(2, 3, 5), (7, 11, 13), (17, 19, 23)]

dataFrame1 = pd.DataFrame(data=data1)
dataFrame2 = pd.DataFrame(data=data2)
combined = dataFrame1.combine(dataFrame2, selectVal)

print("First Data Frame:")
print(dataFrame1)
print("Second Data Frame:")
print(dataFrame2)
print("New merged/combined Data Frame:")
print(combined)
Output:
First Data Frame:
   0   1   2
0  0   1   1
1  2   3   5
2  8  13  21
Second Data Frame:
    0   1   2
0   2   3   5
1   7  11  13
2  17  19  23
New merged/combined Data Frame:
   0   1   2
0  0   1   5
1  2   3  13
2  8  13  23
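In the output above, each column of the result comes entirely from one of the two DataFrames. For contrast, the sketch below passes numpy.maximum() as the function, so the choice is made per element instead of per column. The data and the names df1, df2 and elementwise_max are assumptions made for this illustration and are not part of the example above.

```python
import numpy as np
import pandas as pd

# Illustrative data: neither DataFrame is larger everywhere
df1 = pd.DataFrame([(1, 20), (30, 4)])
df2 = pd.DataFrame([(10, 2), (3, 40)])

# np.maximum compares the two columns value by value,
# so every cell of the result is the larger of the two inputs
elementwise_max = df1.combine(df2, np.maximum)
print(elementwise_max)
```

Here a single result column can mix values from both inputs, which a whole-column selector such as selectVal() never does.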