kakk11

Reputation: 918

Pandas: applying a simple function to NaN returns a value instead of NaN?

import pandas as pd
import numpy as np

pd.DataFrame(
    {'a': [0, 1, 2, 3],
     'b': [np.nan, np.nan, np.nan, 3]}
).apply(lambda x: x > 1)

This returns False for the NaN entries in column b, whereas I would like to get NaN:

    a       b
0   False   False
1   False   False
2   True    False
3   True    True

Expected

    a       b
0   False   NaN
1   False   NaN
2   True    NaN
3   True    True

I'd really like my arithmetic to keep track of where I had data and where I did not. How might I achieve that?

Upvotes: 1

Views: 230

Answers (1)

jezrael

Reputation: 862681

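Any comparison with NaN evaluates to False, so the missing positions have to be restored afterwards. Use DataFrame.mask or DataFrame.where with DataFrame.isna or DataFrame.notna: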

df = df.apply(lambda x: x > 1).mask(df.isna())
#df = df.apply(lambda x: x > 1).where(df.notna())

For better performance, avoid apply and compare the whole DataFrame directly:

df = (df > 1).mask(df.isna())
#df = (df > 1).where(df.notna())

print(df)
       a    b
0  False  NaN
1  False  NaN
2   True  NaN
3   True  1.0

Finally, convert to the nullable boolean dtype:

df = df.astype('boolean')
print(df)
       a     b
0  False  <NA>
1  False  <NA>
2   True  <NA>
3   True  True
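
Putting the steps above together, here is a minimal self-contained sketch. It rebuilds the DataFrame from the question and keeps the comparison in a separate variable; the name `result` is only an illustrative choice, so that the original `df` is not overwritten.

```python
import numpy as np
import pandas as pd

# Same data as in the question: column b is missing in the first three rows.
df = pd.DataFrame({'a': [0, 1, 2, 3],
                   'b': [np.nan, np.nan, np.nan, 3]})

# 1. Compare element-wise (NaN > 1 evaluates to False).
# 2. Mask the positions that were missing in the input.
# 3. Convert to the nullable boolean dtype so missing entries show as <NA>.
result = (df > 1).mask(df.isna()).astype('boolean')

print(result)
print(result.dtypes)
```

Because the mask comes from `df.isna()` on the original data, a missing value stays missing no matter which comparison is applied.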

Upvotes: 1
