Bhanu Tez

Reputation: 306

Setting values in a DataFrame based on condition

I have large datasets of more than 1 million rows and a varying number of columns (sometimes a single column, sometimes more). I wrote a script that initially worked fine, but I recently ran into an issue that can be reproduced with the script below.

import pandas as pd
df=pd.DataFrame({'a':[0,0],'b':[100,1]})
df[df>0]='S1'
df[df==0]='S0'

Error:

TypeError: Cannot do inplace boolean setting on mixed-types with a non np.nan value

Lines 3 and 4 can be swapped; either way, the error is raised on the 4th line.

Initial df:

a  b
0  100
0  1

Expected df:

a   b
S0  S1
S0  S1

Upvotes: 2

Views: 871

Answers (1)

cs95

Reputation: 402263

Boolean-mask assignment isn't quite the right tool for DataFrame-wide replacements. Use where or mask instead:

df = df.where(df == 0, 'S1').where(df > 0, 'S0')
df
    a   b
0  S0  S1
1  S0  S1
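
For completeness, here is a minimal sketch of the mask variant mentioned above. mask is the inverse of where: it replaces values where the condition is True. The variable name result is only illustrative, and the conditions are evaluated against the original df, as in the where version.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, 0], 'b': [100, 1]})

# mask replaces cells where the condition is True;
# both conditions are computed from the original numeric df
result = df.mask(df > 0, 'S1').mask(df == 0, 'S0')
print(result)
```

This returns a new DataFrame rather than modifying df in place, so the mixed-type assignment from the question never happens.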

Alternatively, you can use np.select:

import numpy as np
df[:] = np.select([df > 0, df == 0], ['S1', 'S0'], default=df)
df
    a   b
0  S0  S1
1  S0  S1
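
A variation on the np.select approach, as a sketch only: instead of assigning back into df[:], it builds a new DataFrame from the selected labels, which sidesteps writing strings into integer columns in place. The names labels and labeled are illustrative, and passing the default as an object array is my assumption for keeping any unmatched values (for example negatives) unchanged.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [0, 0], 'b': [100, 1]})

# Conditions are checked in order; cells matching neither condition
# fall back to the original value (kept as Python objects)
labels = np.select(
    [(df > 0).to_numpy(), (df == 0).to_numpy()],
    ['S1', 'S0'],
    default=df.to_numpy().astype(object),
)

# Rebuild a DataFrame with the original index and column labels
labeled = pd.DataFrame(labels, index=df.index, columns=df.columns)
print(labeled)
```

Since the original df is left untouched, this may be easier to reason about when the same logic runs over frames with a varying number of columns.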

Upvotes: 2
