Rtut

Reputation: 1007

Pandas pd.isnull() function

I need to replace the non-null values in my dataframe with 1 and the null values with 0.

Here is my dataframe:

import pandas as pd

my_list = [['a', 'b', 'c'], ['test1', 'test2', None], [None, '101', '000']]

mydf = pd.DataFrame(my_list, columns=['col1', 'col2', 'col3'])

mydf

    col1   col2  col3
0      a      b     c
1  test1  test2  None
2   None    101   000

mydf.where((pd.isnull(mydf)),0,inplace=True)

mydf

   col1 col2  col3
0     0    0     0
1     0    0  None
2  None    0     0

I am not sure why it is replacing the non-null values with zero. Using pd.notnull() does the opposite. Can anyone explain what I am missing here?

Upvotes: 3

Views: 12964

Answers (2)

root

Reputation: 33793

This is the expected behavior for where. According to the docs, where keeps the values at positions where the condition is True and replaces the values at positions where it is False. pd.isnull returns True only for the None entries, which is why those were the only values that were kept.
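To make that rule concrete, here is a minimal sketch that rebuilds the dataframe from the question and inspects the condition alongside the result. The names cond and kept are made up for illustration, and dtype=object is an added assumption so that None is stored unchanged regardless of pandas version.

```python
import pandas as pd

my_list = [['a', 'b', 'c'], ['test1', 'test2', None], [None, '101', '000']]
# dtype=object is an assumption of this sketch, not part of the question
mydf = pd.DataFrame(my_list, columns=['col1', 'col2', 'col3'], dtype=object)

# Boolean frame: True only at the two None cells
cond = pd.isnull(mydf)

# where keeps cells where cond is True (the nulls) and
# replaces every other cell with 0
kept = mydf.where(cond, 0)

print(cond)
print(kept)
```

Because cond is True only for the nulls, the nulls are the only cells that survive, which matches the output in the question.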

You either want to use the mask function with pd.isnull:

mydf.mask(pd.isnull(mydf), 0, inplace=True)

Or you want to use where with pd.notnull:

mydf.where(pd.notnull(mydf), 0, inplace=True)
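As a quick check that the two calls are interchangeable, the sketch below runs both without inplace=True (so each returns a new dataframe) and compares the results. The names filled_a and filled_b are hypothetical, and dtype=object is again an assumption of the sketch rather than something from the question.

```python
import pandas as pd

my_list = [['a', 'b', 'c'], ['test1', 'test2', None], [None, '101', '000']]
# dtype=object is an assumption of this sketch
mydf = pd.DataFrame(my_list, columns=['col1', 'col2', 'col3'], dtype=object)

# mask replaces cells where the condition is True
filled_a = mydf.mask(pd.isnull(mydf), 0)

# where replaces cells where the condition is False
filled_b = mydf.where(pd.notnull(mydf), 0)

print(filled_a)
print(filled_a.equals(filled_b))
```

Note that both variants only turn the nulls into 0. The non-null strings are left as they are, so neither call on its own produces the 1/0 result asked for in the question.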

Regardless, @piRSquared's method is probably better than either of the above.

Upvotes: 1

piRSquared

Reputation: 294338

Just do:

mydf = mydf.notnull() * 1
mydf


For completeness, the inverse, applied to the original dataframe:

mydf.isnull() * 1
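The multiplication works because notnull() and isnull() return boolean dataframes, and multiplying a boolean by 1 yields an integer. A self-contained sketch of both directions follows, with astype('int64') shown as an equivalent spelling. The names not_null_flags, null_flags, and not_null_alt are made up for illustration, and the original dataframe is left unassigned so both calls see the same data.

```python
import pandas as pd

my_list = [['a', 'b', 'c'], ['test1', 'test2', None], [None, '101', '000']]
mydf = pd.DataFrame(my_list, columns=['col1', 'col2', 'col3'])

# 1 where a value is present, 0 where it is null
not_null_flags = mydf.notnull() * 1

# The inverse: 1 where the value is null, 0 otherwise
null_flags = mydf.isnull() * 1

# Equivalent to the multiplication, with the conversion spelled out
not_null_alt = mydf.notnull().astype('int64')

print(not_null_flags)
print(null_flags)
```

Whether to multiply by 1 or call astype is mostly a matter of taste; the explicit cast states the intent more directly.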


Upvotes: 5
