Reputation: 340
I am trying to change the values in several pandas columns based on a condition (if a value in column 1 is greater than 1, set it to 1). I can do that for one column without any error, but how can I do it for multiple columns (wherever a value in one of the selected columns is greater than one, set it to one)?
df.loc[df['column1']>=1,'column1'] =1
The line above works fine, but the following line does not:
df.loc[df['column1','column2']>=1,['column1','column2']] = 1
I get the following error. Any help would be appreciated.
KeyError: ('column1', 'column2')
Upvotes: 1
Views: 93
Reputation: 75080
I don't think .loc[] is the best fit for a multi-column key; instead, try np.where():
df[['column1','column2']]=np.where(df[['column1','column2']]>=1,1,df[['column1','column2']])
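As for the KeyError itself, a minimal sketch of what seems to be going on, using a small made-up frame (the values here are illustrative, not the asker's data): a single pair of brackets with two names is treated as one tuple key, whereas a list inside the brackets selects both columns.

```python
import pandas as pd

# Hypothetical toy frame, only for illustration.
df = pd.DataFrame({"column1": [3, -1, 1], "column2": [0, 1, 3]})

# df['column1', 'column2'] is read as the single key ('column1', 'column2'),
# which is not a column name here, so pandas raises KeyError.
try:
    df["column1", "column2"]
    raised = False
except KeyError:
    raised = True

# A list of names selects both columns and returns a DataFrame,
# so the comparison yields an element-wise boolean DataFrame.
subset = df[["column1", "column2"]]
mask = subset >= 1
```

The boolean frame in `mask` has the same shape as the two-column selection, which is what np.where() uses as its condition.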
Adding an example:
import numpy as np
import pandas as pd
np.random.seed(123)
df=pd.DataFrame(np.random.randint(-2,4,20).reshape(5,4),
                columns=[f'column{i+1}' for i in range(4)])
print(df)
   column1  column2  column3  column4
0        3        0        2        0
1       -1        1        0        1
2       -1       -1       -2       -1
3       -1       -2       -2       -1
4        1        3        2       -2
df[['column1','column2']]=np.where(df[['column1','column2']]>=1,1,df[['column1','column2']])
print(df)
   column1  column2  column3  column4
0        1        0        2        0
1       -1        1        0        1
2       -1       -1       -2       -1
3       -1       -2       -2       -1
4        1        1        2       -2
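Since the goal is to cap values at 1, two other spellings may be worth considering: DataFrame.clip(upper=1) and DataFrame.mask(). The sketch below compares them with np.where() on a hand-typed frame (the variable names `cols`, `via_where`, `via_clip` and `via_mask` are just for this illustration); treat it as one possible approach rather than the recommended one.

```python
import numpy as np
import pandas as pd

# Hand-typed illustrative data.
df = pd.DataFrame({
    "column1": [3, -1, -1, -1, 1],
    "column2": [0, 1, -1, -2, 3],
    "column3": [2, 0, -2, -2, 2],
})
cols = ["column1", "column2"]  # columns to cap at 1

# np.where: pick 1 where the condition holds, else keep the original value.
via_where = df.copy()
via_where[cols] = np.where(via_where[cols] >= 1, 1, via_where[cols])

# clip: cap every value in the selected columns at an upper bound of 1.
via_clip = df.copy()
via_clip[cols] = via_clip[cols].clip(upper=1)

# mask: replace values where the condition is True with 1.
via_mask = df.copy()
via_mask[cols] = via_mask[cols].mask(via_mask[cols] >= 1, 1)
```

All three leave column3 untouched and give the same values for column1 and column2 on this data; clip() is the shortest when the replacement value equals the threshold, while np.where() and mask() also cover cases where the two differ.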
Upvotes: 2