Reputation: 446
Replace all the NaN values in a dataframe with None
In[6]: import pandas as pd
In[7]: import numpy as np
In[8]: df = pd.DataFrame({"a":[1,np.nan],"b":[np.nan,"foo"]})
In[9]: df
Out[9]:
a b
0 1.0 NaN
1 NaN foo
In[10]: pd.notnull(df)
Out[10]:
a b
0 True False
1 False True
In[11]: df.where(pd.notnull(df), None)
Out[11]:
a b
0 1.0 None
1 NaN foo
On the other machine (Python 3.8.5, pandas==1.1.1), the same call gives:
In[11]: df.where(pd.notnull(df), None)
Out[11]:
a b
0 1.0 None
1 None foo
I have tested this on another machine with Python 3.8.5 and pandas==1.1.1, and it worked as expected. Is this a bug?
Thank you!
Upvotes: 1
Views: 1863
Reputation: 51165
This is not a bug. In fact, the result you saw with pandas==1.1.1 was the bug; it was fixed in later versions by PR39761. The fix is also mentioned in the 1.3.0 release notes.
In general, pandas will try to cast to avoid results that contain object dtype columns, and this is no exception: in the float column a, the None is cast back to NaN. If you would like to keep None, cast to object first:
>>> df.astype(object).where(pd.notnull(df), None)
a b
0 1.0 None
1 None foo
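For reuse, the same cast can be wrapped in a small function. This is only a sketch: `nan_to_none` is a hypothetical helper name, not part of pandas, and it assumes you are fine with every column of the result having object dtype.

```python
import numpy as np
import pandas as pd


def nan_to_none(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `frame` with missing values replaced by None.

    Hypothetical helper (not a pandas API). Casting to object first
    stops pandas from coercing None back to NaN in numeric columns.
    """
    # Build the mask from the original frame, then fill on the object copy.
    return frame.astype(object).where(pd.notnull(frame), None)


df = pd.DataFrame({"a": [1, np.nan], "b": [np.nan, "foo"]})
result = nan_to_none(df)
```

Note that the result is no longer numeric, so arithmetic on column a will be slower and may fail on the None entries. This conversion is probably best done as a last step, for example just before serializing or writing to a database.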
It seems as though there has been some grumbling in the community about this bug-fix, discussed here.
Upvotes: 3