Ben Pap

Reputation: 2579

Dropping column during for loop - Pandas

I have two basic DataFrames, and I combine them into a list called dfCombo:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.arange(12).reshape(3,4), columns=['A', 'B', 'C', 'D'])
df2 = pd.DataFrame(np.arange(12,24).reshape(3,4), columns=['A', 'B', 'C', 'D'])
dfCombo = [df, df2]

Both are 3x4 DataFrames with the columns A, B, C, D.

I can use a for loop to add a column to both DataFrames with the following code:

for df3 in dfCombo:
    df3['E'] = df3['A'] + df3['B']

After this, both df and df2 have a new column E. However, when I try to drop a column the same way with the code below, no columns are dropped:

for df3 in dfCombo:
    df3 = df3.drop('B', axis = 1)

or

for df3 in dfCombo:
    df3 = df3.drop(columns = ['B'])

If I use the same code on a single DataFrame, the column is dropped:

df2 = df2.drop('B', axis = 1)

or

df2 = df2.drop(columns = ['B'])

If you could help me understand what is going on I would be most appreciative.

Upvotes: 1

Views: 8449

Answers (1)

rahlf23

Reputation: 9019

You need to use inplace=True:

for df3 in dfCombo:
    df3.drop('B', axis = 1, inplace=True)

After the loop, df and df2 look like this:

   A   C   D   E
0  0   2   3   1
1  4   6   7   9
2  8  10  11  17

    A   C   D   E
0  12  14  15  25
1  16  18  19  33
2  20  22  23  41

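With the default inplace=False, drop returns a new DataFrame and leaves the original unchanged, so the result has to be assigned back. Inside the loop, df3 = df3.drop(...) only rebinds the loop variable df3 to that new DataFrame; the objects stored in dfCombo (and the names df and df2) still refer to the originals. Adding a column with df3['E'] = ... works because that assignment modifies the existing object. With inplace=True, drop modifies the existing DataFrame and returns None, so there is no need to assign the result.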
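To make the difference concrete, here is a minimal sketch comparing the three cases on the frames from the question. The helper make_frames and the variable names are hypothetical and only there so each case starts from fresh data; the third case (rebuilding the list) is an alternative I am adding, not something from the original code.

```python
import numpy as np
import pandas as pd


def make_frames():
    """Hypothetical helper: rebuild the two 3x4 frames from the question."""
    cols = ["A", "B", "C", "D"]
    first = pd.DataFrame(np.arange(12).reshape(3, 4), columns=cols)
    second = pd.DataFrame(np.arange(12, 24).reshape(3, 4), columns=cols)
    return [first, second]


# 1. Rebinding the loop variable: the list still holds the original frames.
frames = make_frames()
for frame in frames:
    frame = frame.drop(columns=["B"])  # new object, bound only to `frame`
cols_after_rebind = [list(f.columns) for f in frames]

# 2. In-place drop: the frames stored in the list are modified themselves.
frames = make_frames()
for frame in frames:
    frame.drop(columns=["B"], inplace=True)
cols_after_inplace = [list(f.columns) for f in frames]

# 3. Without inplace: build a new list from the frames that drop returns.
frames = make_frames()
frames = [f.drop(columns=["B"]) for f in frames]
cols_after_rebuild = [list(f.columns) for f in frames]

print(cols_after_rebind)   # 'B' is still present in both frames
print(cols_after_inplace)  # 'B' is gone
print(cols_after_rebuild)  # 'B' is gone
```

Note that in the third case only the list holds the new frames. If you also keep separate names such as df and df2, they still point at the originals unless you reassign them, for example with df, df2 = dfCombo.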

Upvotes: 6
