Reputation: 306
I have big datasets with more than 1 million rows and a varying number of columns (sometimes one column, sometimes more). Initially, I created a script and it was working fine, but recently I ran into an issue that can be reproduced with the script below.
import pandas as pd
df=pd.DataFrame({'a':[0,0],'b':[100,1]})
df[df>0]='S1'
df[df==0]='S0'
Error:
TypeError: Cannot do inplace boolean setting on mixed-types with a non np.nan value
Lines 3 and 4 of the script can be swapped, and the error will still be raised on the 4th line.
initial df:
a b
0 100
0 1
Expected df:
a b
S0 S1
S0 S1
Upvotes: 2
Views: 871
Reputation: 402263
For DataFrame-wide replacements, that isn't quite right. Use where or mask:
df = df.where(df == 0, 'S1').where(df > 0, 'S0')
df
a b
0 S0 S1
1 S0 S1
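The answer mentions mask but only shows where. As a minimal sketch of the mask version (the name result is only for illustration): mask replaces values where the condition is True, so the conditions read the same way as in the question. Both conditions are evaluated against the original numeric df, not against the partially replaced frame.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, 0], 'b': [100, 1]})

# mask replaces values where the condition is True
# (where does the opposite: it replaces values where the condition is False).
# Both conditions are computed on the original numeric df.
result = df.mask(df > 0, 'S1').mask(df == 0, 'S0')
print(result)
```

Because the result is assigned to a new name, the comparisons never run on a frame that already contains strings, which is what the original in-place version ran into.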
Alternatively, you can use np.select:
import numpy as np
df[:] = np.select([df > 0, df == 0], ['S1', 'S0'], default=df)
df
a b
0 S0 S1
1 S0 S1
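If there are really only two outcomes, a variation on the same idea is np.where, building a new DataFrame instead of assigning into the existing one. This is a sketch under that assumption, and the names labels and out are only for illustration. Note that it differs from the np.select version for negative values: here anything that is not greater than 0 becomes 'S0', whereas np.select with default=df would leave negatives unchanged.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [0, 0], 'b': [100, 1]})

# Pick 'S1' where the value is positive and 'S0' everywhere else.
# The result is a plain NumPy array of strings with the same shape as df.
labels = np.where((df > 0).to_numpy(), 'S1', 'S0')

# Wrap the array in a new DataFrame, reusing the original row and column labels.
out = pd.DataFrame(labels, index=df.index, columns=df.columns)
print(out)
```

Since both choices are strings, no mixing of integers and strings is involved, which may make this variant less sensitive to dtype handling that differs between library versions.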
Upvotes: 2