N91

Reputation: 415

Conditionally replacing column values in a dataframe?

I have a dataframe and I want to replace some values in one column based on a condition. My dataframe looks like this:

ID    customer_name   arrival_month    leaving_month
1524   ABC              201508           201605 
1185   XYZ              201701           201801
8456   IJK              201801           201902

The operation is simple: wherever leaving_month > 201802, I want to replace the value in the leaving_month column with currentmonth = 201802. I tried .loc, which gives the error below.

df.loc[(df['leaving_month'] > 201802)] = 201802
KeyError: 'leaving_month'

I also tried np.where, which gives an error as well.

df['leaving_month']=np.where(df['leaving_month']>currentmonth, currentmonth)
KeyError: 'leaving_month'

I have also tried a brute-force loop:

for o in range(len(df)):
    if df.loc[o, 'leaving_month'] > currentmonth:
        df.loc[o, 'leaving_month'] = currentmonth
IndexingError: Too many indexers

Can someone please point me in the right direction, figure out what I am doing wrong, or suggest a better solution? This is quite a simple problem, but somehow I am not getting through.

Upvotes: 0

Views: 91

Answers (1)

Alex

Reputation: 19124

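Your .loc call has only a row indexer, so the assignment replaces every column in the matching rows. Instead, pass the column label as a second indexer to .loc so that only that column is set. See the second indexer in the solution below.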

df.loc[df['leaving_month'] > 201802, 'leaving_month'] = 201802
df

returns

     ID customer_name  arrival_month  leaving_month
0  1524           ABC         201508         201605
1  1185           XYZ         201701         201801
2  8456           IJK         201801         201802
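
For a reproducible check, here is a minimal sketch that rebuilds the sample frame from the question and applies the same fix. It also shows two alternatives that are not part of the original answer: np.where with all three arguments (the attempt in the question passes only two) and Series.clip. The names by_loc, by_where and by_clip are only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame({
    "ID": [1524, 1185, 8456],
    "customer_name": ["ABC", "XYZ", "IJK"],
    "arrival_month": [201508, 201701, 201801],
    "leaving_month": [201605, 201801, 201902],
})
currentmonth = 201802

# Row mask plus column label: only leaving_month is overwritten.
by_loc = df.copy()
by_loc.loc[by_loc["leaving_month"] > currentmonth, "leaving_month"] = currentmonth

# np.where takes condition, value where true, value where false.
by_where = df.copy()
by_where["leaving_month"] = np.where(
    by_where["leaving_month"] > currentmonth,
    currentmonth,
    by_where["leaving_month"],
)

# Series.clip caps every value at the given upper bound.
by_clip = df.copy()
by_clip["leaving_month"] = by_clip["leaving_month"].clip(upper=currentmonth)

print(by_loc)
```

All three approaches leave the other columns untouched. The .loc form is the most general, since the replacement value does not have to be the threshold itself.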

You can read about DataFrame indexing in the Pandas docs.
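
Separately, KeyError: 'leaving_month' means pandas could not find a column with exactly that label. The question does not show how the frame was loaded, so the cause there is unknown. The sketch below assumes one common case, stray whitespace in the header, and uses a hypothetical frame to show how to inspect and clean the labels.

```python
import pandas as pd

# Hypothetical frame whose column label carries stray spaces,
# as can happen when a header row is read from a file.
df = pd.DataFrame({" leaving_month ": [201605, 201801, 201902]})

# Print the exact labels to spot whitespace or typos.
print(df.columns.tolist())

# Strip surrounding whitespace from every column label.
df.columns = df.columns.str.strip()

# The lookup now succeeds and the conditional assignment works.
df.loc[df["leaving_month"] > 201802, "leaving_month"] = 201802
```

If the printed labels already match exactly, the problem lies elsewhere, for example the column may have been moved into the index.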

Upvotes: 1
