Reputation: 918
import pandas as pd
import numpy as np
pd.DataFrame(
{'a':[0,1,2,3],
'b':[np.nan, np.nan, np.nan,3]}
).apply(lambda x: x> 1)
returns False for column b wherever the input is NaN, whereas I would like to get NaN there:
a b
0 False False
1 False False
2 True False
3 True True
Expected
a b
0 False NaN
1 False NaN
2 True NaN
3 True True
I'd really like my arithmetic to keep track of where I had data and where I did not. How might I achieve that?
Upvotes: 1
Views: 230
Reputation: 862681
Use DataFrame.mask with DataFrame.isna, or DataFrame.where with DataFrame.notna:
df = df.apply(lambda x: x> 1).mask(df.isna())
#df = df.apply(lambda x: x> 1).where(df.notna())
For better performance, avoid apply:
df = (df > 1).mask(df.isna())
#df = (df > 1).where(df.notna())
print (df)
a b
0 False NaN
1 False NaN
2 True NaN
3 True 1.0
Finally, convert to the nullable boolean dtype so the missing values are shown as <NA>:
df = df.astype('boolean')
print (df)
a b
0 False <NA>
1 False <NA>
2 True <NA>
3 True True
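For reference, the steps above can be put together into one self-contained sketch. It rebuilds the DataFrame from the question, does the vectorized comparison, masks the positions that were NaN in the input, and casts to the nullable boolean dtype. The name result is only an illustrative choice, and the snippet assumes a pandas version that supports the 'boolean' dtype (1.0 or later).

```python
import numpy as np
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({"a": [0, 1, 2, 3], "b": [np.nan, np.nan, np.nan, 3]})

# A comparison with NaN evaluates to False, so mask those positions
# afterwards using the missing-value pattern of the original frame.
compared = df > 1
result = compared.mask(df.isna())

# Cast to the nullable boolean dtype; masked positions become <NA>.
result = result.astype("boolean")

print(result)
print(result.dtypes)
```

Keeping the comparison and the mask as separate steps makes it easy to reuse df.isna() for other operations that should also keep track of where the data was missing.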
Upvotes: 1