Reputation: 415
I have a dataframe and I want to replace some values in one column based on a condition. My dataframe looks like this:
ID customer_name arrival_month leaving_month
1524 ABC 201508 201605
1185 XYZ 201701 201801
8456 IJK 201801 201902
I am trying a simple operation here: I want to replace the values in the leaving_month column with the currentmonth value (201802) wherever leaving_month > 201802. I tried .loc, which gives the error below.
df.loc[(df['leaving_month'] > 201802)] = 201802
KeyError: 'leaving_month'
I have also tried np.where, which also gives an error.
df['leaving_month']=np.where(df['leaving_month']>currentmonth, currentmonth)
KeyError: 'leaving_month'
I have also tried brute-force looping:
for o in range(len(df)):
    if(df.loc[o,'leaving_month']>currentmonth):
        df.loc[o,'leaving_month']=currentmonth
IndexingError: Too many indexers
Can someone please point me in the right direction, figure out what I am doing wrong, or suggest a better solution? This is quite a simple problem, but somehow I am not getting through.
Upvotes: 0
Views: 91
Reputation: 19124
You are replacing entire rows. Instead, set a specific column with .loc. See the second indexer in the solution below.
df.loc[df['leaving_month'] > 201802, 'leaving_month'] = 201802
df
returns
ID customer_name arrival_month leaving_month
0 1524 ABC 201508 201605
1 1185 XYZ 201701 201801
2 8456 IJK 201801 201802
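For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the frame from the question is rebuilt by hand with integer month columns; the variable name current_month is my own choice, not something from the question.

```python
import pandas as pd

# Rebuild the sample data from the question (assumed to be integer columns).
df = pd.DataFrame({
    "ID": [1524, 1185, 8456],
    "customer_name": ["ABC", "XYZ", "IJK"],
    "arrival_month": [201508, 201701, 201801],
    "leaving_month": [201605, 201801, 201902],
})

current_month = 201802  # hypothetical name for the cap value

# First indexer: boolean mask selecting the rows to change.
# Second indexer: the single column to overwrite in those rows.
mask = df["leaving_month"] > current_month
df.loc[mask, "leaving_month"] = current_month

print(df)
```

Without the second indexer, the assignment would overwrite every column in the matching rows, not just leaving_month.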
You can read about DataFrame indexing in the Pandas docs.
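As a side note, the np.where attempt in the question passes only two arguments; when a replacement value is given, np.where also needs the value to use where the condition is False. A rough sketch of that form, plus Series.clip as another option, assuming the same sample data with integer columns (current_month is again a name I picked):

```python
import numpy as np
import pandas as pd

# Sample data from the question, assumed to hold integers.
df = pd.DataFrame({
    "ID": [1524, 1185, 8456],
    "customer_name": ["ABC", "XYZ", "IJK"],
    "arrival_month": [201508, 201701, 201801],
    "leaving_month": [201605, 201801, 201902],
})

current_month = 201802  # hypothetical name for the cap value

# np.where(condition, value_if_true, value_if_false):
# keep the original value wherever the condition does not hold.
capped_where = np.where(
    df["leaving_month"] > current_month,
    current_month,
    df["leaving_month"],
)

# Series.clip caps every value at the given upper bound.
capped_clip = df["leaving_month"].clip(upper=current_month)

df["leaving_month"] = capped_clip
print(df)
```

Both forms should give the same column here; clip is the shorter of the two when the rule is simply a cap.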
Upvotes: 1