Remove elements from a pandas DataFrame using drop() and truncate() Methods

Overview:

  • A pandas DataFrame is a two-dimensional, heterogeneous container that is typically backed by NumPy ndarrays as its underlying storage.
  • Data processing often requires removing unwanted rows and/or columns from a DataFrame and creating a new DataFrame from the resulting data.

Remove rows and columns of DataFrame using drop():

  • Specific rows and columns can be removed from a DataFrame object using the drop() instance method.
  • The drop() method accepts an axis parameter: 0 (or "index") for rows, which is the default, and 1 (or "columns") for columns. As an alternative to axis, the index parameter specifies the row labels to remove and the columns parameter specifies the column labels to remove.
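
As a minimal sketch of the two calling styles, the following compares axis-based and keyword-based calls to drop() on a small, made-up DataFrame. The labels r1, r2, c1 and c2 are purely illustrative and do not come from the examples in this article.

```python
import pandas as pds

# Hypothetical 2x2 frame used only for illustration
df = pds.DataFrame([[1, 2], [3, 4]],
                   index=["r1", "r2"],
                   columns=["c1", "c2"])

# Rows: axis=0 (the default) is equivalent to index=...
rowsByAxis = df.drop("r1", axis=0)
rowsByKeyword = df.drop(index="r1")

# Columns: axis=1 is equivalent to columns=...
colsByAxis = df.drop("c2", axis=1)
colsByKeyword = df.drop(columns="c2")

print(rowsByAxis.equals(rowsByKeyword))   # True
print(colsByAxis.equals(colsByKeyword))   # True

# drop() returns a new DataFrame by default; the original is unchanged
print(df.shape)                           # (2, 2)
```

The keyword form (index=, columns=) tends to be easier to read because the call states which axis the labels belong to.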

Remove rows or columns of DataFrame using truncate():

  • The truncate() method removes the rows or columns whose index labels fall before the before value or after the after value. The before and after parameters act as thresholds on the index. Both bounds are inclusive, so the entries at before and after themselves are kept, and a new DataFrame is returned.
  • By default truncate() removes rows. To remove columns, pass axis="columns" when calling the method.
  • Similarly, a Series, which is a one-dimensional container, provides its own truncate() and drop() methods.
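
The Series variants can be sketched as follows. The values are made up for illustration, and the default integer index 0..4 is assumed. Note that truncate() expects a sorted index, which the default index satisfies.

```python
import pandas as pds

# Hypothetical Series with the default sorted integer index 0..4
series = pds.Series([10, 20, 30, 40, 50])

# Keep labels 1 through 3; both bounds are inclusive
middle = series.truncate(before=1, after=3)
print(middle.tolist())    # [20, 30, 40]

# Remove the entries labelled 0 and 4 by naming them explicitly
trimmed = series.drop([0, 4])
print(trimmed.tolist())   # [20, 30, 40]
```

Here both calls keep the same elements: truncate() does so by stating the range to keep, while drop() does so by listing the labels to remove.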

Example – Remove a column from a DataFrame using drop():

# Example Python program that removes a specific column from
# a pandas DataFrame instance
import pandas as pds

# A dictionary of cities vs some measurement
subjectVsScores = {"City1": [78, 45, 50, 85, 63, 72, 40, 78, 60, 69],
                   "City2": [52, 67, 81, 65, 83, 42, 76, 54, 73, 81],
                   "City3": [67, 55, 81, 62, 58, 71, 57, 45, 49, 51]
}

# Create a DataFrame using the dictionary
dataFrame = pds.DataFrame(subjectVsScores)
print(dataFrame)

# Drop data for City2 from the DataFrame (axis=1 selects columns)
newFrame = dataFrame.drop("City2", axis=1)
print("New DataFrame after removal of a column:")
print(newFrame)

Output:

   City1  City2  City3
0     78     52     67
1     45     67     55
2     50     81     81
3     85     65     62
4     63     83     58
5     72     42     71
6     40     76     57
7     78     54     45
8     60     73     49
9     69     81     51
New DataFrame after removal of a column:
   City1  City3
0     78     67
1     45     55
2     50     81
3     85     62
4     63     58
5     72     71
6     40     57
7     78     45
8     60     49
9     69     51
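
Row removal works the same way when drop() is given row labels instead of column labels. A minimal sketch, using a smaller made-up frame rather than the full data above:

```python
import pandas as pds

# Hypothetical scores, a smaller stand-in for illustration only
frame = pds.DataFrame({"City1": [78, 45, 50, 85],
                       "City2": [52, 67, 81, 65]})

# Remove the rows labelled 0 and 2; axis=0 (rows) is the default
withoutRows = frame.drop([0, 2])
print(withoutRows)
```

The remaining rows keep their original labels (1 and 3 here); drop() does not renumber the index.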

 

Example – Removing columns from a pandas DataFrame using drop():

# Example Python program to drop columns
# from a pandas DataFrame instance using
# drop() method
import pandas as pds

# A matrix as list of tuples
matrix = [(0, 20, 30),
          (40, 0, 60),
          (70, 80, 0)]

# Create a DataFrame instance
df = pds.DataFrame(matrix)

# Remove 1st and 3rd columns
newdf1 = df.drop([0, 2], axis=1)
print("New DataFrame after removing 1st and 3rd columns:")
print(newdf1)

# Remove 2nd column
newdf2 = df.drop(columns=1)
print("New DataFrame after removing 2nd column:")
print(newdf2)

Output:

New DataFrame after removing 1st and 3rd columns:
    1
0  20
1   0
2  80
New DataFrame after removing 2nd column:
    0   2
0   0  30
1  40  60
2  70   0

 

Example – Removing rows and columns from a pandas DataFrame using drop():

# Example Python program that removes rows and columns
# from a pandas DataFrame
import pandas as pds

dataMatrix = [(0.6, 0.7, 0.2),
              (0.1, 0.4, 0.5),
              (0.3, 0.9, 0.2),
              (0.6, 0.8, 0.1)]

dataFrame = pds.DataFrame(dataMatrix,
                          columns=["E1", "E2", "E3"],
                          index=["E5", "E6", "E7", "E8"])

print("Original DataFrame:")
print(dataFrame)

# Remove 2nd row and 3rd column to produce a new DataFrame
newdf = dataFrame.drop(index="E6", columns="E3")
print("New DataFrame obtained by removing 2nd row and 3rd column:")
print(newdf)

Output:

Original DataFrame:
     E1   E2   E3
E5  0.6  0.7  0.2
E6  0.1  0.4  0.5
E7  0.3  0.9  0.2
E8  0.6  0.8  0.1
New DataFrame obtained by removing 2nd row and 3rd column:
     E1   E2
E5  0.6  0.7
E7  0.3  0.9
E8  0.6  0.8

 

Example – Removing rows using a threshold on the DataFrame indices with truncate():

# Example Python program that truncates rows of a pandas DataFrame using truncate()
import pandas as pds

matrix = [(0, 1, 3),
          (1, 2, 1),
          (2, 3, 0)]

df = pds.DataFrame(matrix, columns=["a", "b", "c"])

# Keep only the rows whose index labels lie between 1 and 1 (inclusive)
newdf = df.truncate(before=1, after=1)
print("Contents of the DataFrame:")
print(df)

print("Contents of the DataFrame after removing rows using truncate():")
print(newdf)

Output:

Contents of the DataFrame:
   a  b  c
0  0  1  3
1  1  2  1
2  2  3  0
Contents of the DataFrame after removing rows using truncate():
   a  b  c
1  1  2  1
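
Either bound can also be given on its own. The sketch below reuses the same three-row data and assumes the default integer index; it is meant only to illustrate one-sided truncation.

```python
import pandas as pds

df = pds.DataFrame([(0, 1, 3), (1, 2, 1), (2, 3, 0)],
                   columns=["a", "b", "c"])

# Only before: discard every row whose label is less than 1
fromOne = df.truncate(before=1)
print(fromOne.index.tolist())   # [1, 2]

# Only after: discard every row whose label is greater than 1
uptoOne = df.truncate(after=1)
print(uptoOne.index.tolist())   # [0, 1]
```

In both calls the row labelled 1 is kept, since the thresholds are inclusive.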

 

Example – Removal of DataFrame columns using truncate():

# Example Python program that removes columns from a
# pandas DataFrame instance using the method truncate()
import pandas as pds

data = [(5, 0, 5, 5),
        (10, 1, 10, 2),
        (15, 10, 5, 10)]

# Create a DataFrame instance
dataFrameObject = pds.DataFrame(data)
print("Rows and columns of the original DataFrame:")
print(dataFrameObject)

# Keep columns 1 through 2, remove the rest and print the new DataFrame
newDataFrame = dataFrameObject.truncate(before=1, after=2, axis="columns")
print("DataFrame contents after removal of columns using truncate():")
print(newDataFrame)

Output:

Rows and columns of the original DataFrame:
    0   1   2   3
0   5   0   5   5
1  10   1  10   2
2  15  10   5  10
DataFrame contents after removal of columns using truncate():
    1   2
0   0   5
1   1  10
2  10   5

 


Copyright 2024 © pythontic.com