Combining or merging two pandas DataFrames into one DataFrame

Overview:

  • The combine() method of the pandas DataFrame class combines two DataFrame instances into one, column by column.
  • It takes a function that receives a pair of matching columns (one Series from each DataFrame) and returns the Series, or scalar, to place in the new DataFrame. The function may pick one whole column over the other, or it may compare the two columns element by element, as numpy.maximum() does.
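
The element-wise case can be sketched as follows. This is a minimal illustration, not part of the original example: the names df1, df2 and largest and the sample values are made up for the purpose.

```python
import numpy as np
import pandas as pd

# Two small DataFrames with the same column labels (illustrative values)
df1 = pd.DataFrame({"a": [1, 5], "b": [7, 2]})
df2 = pd.DataFrame({"a": [4, 3], "b": [6, 8]})

# combine() calls the function once per column, passing the two matching
# columns as Series; numpy.maximum then compares them element by element.
largest = df1.combine(df2, np.maximum)

print(largest)
```

Each cell of the result holds the larger of the two values found at that position in df1 and df2.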

Example:

import pandas as pd

# Compare the standard deviations of two matching columns (Series)
# and return the column with the smaller standard deviation
def selectVal(s1, s2):
    if s1.std() < s2.std():
        return s1
    return s2

# Data for the DataFrame objects
data1 = [(0, 1, 1),
         (2, 3, 5),
         (8, 13, 21)]

data2 = [(2, 3, 5),
         (7, 11, 13),
         (17, 19, 23)]

dataFrame1 = pd.DataFrame(data=data1)
dataFrame2 = pd.DataFrame(data=data2)

# selectVal is called once for each pair of matching columns
combined = dataFrame1.combine(dataFrame2, selectVal)

print("First Data Frame:")
print(dataFrame1)

print("Second Data Frame:")
print(dataFrame2)

print("New merged/combined Data Frame:")
print(combined)

Output:

First Data Frame:
   0   1   2
0  0   1   1
1  2   3   5
2  8  13  21
Second Data Frame:
    0   1   2
0   2   3   5
1   7  11  13
2  17  19  23
New merged/combined Data Frame:
   0   1   2
0  0   1   5
1  2   3  13
2  8  13  23
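
In the output above, columns 0 and 1 come from the first DataFrame and column 2 from the second. One way to see why is to compare the per-column standard deviations directly. The sketch below rebuilds the same data; the names std1, std2 and from_first are introduced here only for illustration.

```python
import pandas as pd

dataFrame1 = pd.DataFrame([(0, 1, 1), (2, 3, 5), (8, 13, 21)])
dataFrame2 = pd.DataFrame([(2, 3, 5), (7, 11, 13), (17, 19, 23)])

# Sample standard deviation of every column (pandas uses ddof=1 by default)
std1 = dataFrame1.std()
std2 = dataFrame2.std()

# True where the column of the first DataFrame has the smaller deviation,
# which is the same comparison the selection function makes per column
from_first = std1 < std2

print(from_first)
```

The comparison is True for columns 0 and 1 and False for column 2, which matches the columns selected in the combined DataFrame.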

 

 


Copyright 2024 © pythontic.com