Pandas: applying a simple function to NaN returns a value instead of NaN?

Question

import pandas as pd
import numpy as np

pd.DataFrame(
  {'a':[0,1,2,3],
   'b':[np.nan, np.nan, np.nan,3]}
).apply(lambda x: x> 1)

returns False for the NaN entries in column b, whereas I would like to get NaN:

    a       b
0   False   False
1   False   False
2   True    False
3   True    True

Expected

    a       b
0   False   NaN
1   False   NaN
2   True    NaN
3   True    True

I'd really like my arithmetic to keep track of where I had data and where I did not. How might I achieve that?

jezrael · Accepted Answer

# df is the DataFrame from the question
df = df.apply(lambda x: x > 1).mask(df.isna())
# Equivalent: df = df.apply(lambda x: x > 1).where(df.notna())

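For better performance, avoid apply and compare the whole DataFrame directly (use this instead of the line above, not after it):

df = (df > 1).mask(df.isna())
# Equivalent: df = (df > 1).where(df.notna())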

print (df)
       a    b
0  False  NaN
1  False  NaN
2   True  NaN
3   True  1.0

df = df.astype('boolean')
print (df)
       a     b
0  False  <NA>
1  False  <NA>
2   True  <NA>
3   True  True
3   True  True
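
For reference, the steps above can be combined into one self-contained sketch. It rebuilds the frame from the question, so the name result is only an illustrative choice, and it assumes a pandas version that supports the nullable 'boolean' dtype (1.0 or later):

```python
import numpy as np
import pandas as pd

# Same frame as in the question.
df = pd.DataFrame({'a': [0, 1, 2, 3],
                   'b': [np.nan, np.nan, np.nan, 3]})

# NaN > 1 evaluates to False, so the comparison alone loses the information
# about missing data. Masking with df.isna() puts the missing markers back,
# and astype('boolean') stores the result in the nullable boolean dtype.
result = (df > 1).mask(df.isna()).astype('boolean')

print(result)
print(result.dtypes)
```

Column b should then be missing wherever the input was missing, and both columns should report the boolean dtype.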
