Reputation: 3190
I have a pandas DataFrame with mixed data types. I would like to replace all null values with None (instead of the default np.nan). For some reason, this appears to be nearly impossible.
In reality my DataFrame is read in from a CSV, but here is a simple DataFrame with mixed data types to illustrate my problem.
import numpy as np
import pandas as pd
df = pd.DataFrame(index=[0], columns=range(5))
df.iloc[0] = [1, 'two', np.nan, 3, 4]
I can't do:
>>> df.fillna(None)
ValueError: must specify a fill method or value
nor:
>>> df[df.isnull()] = None
TypeError: Cannot do inplace boolean setting on mixed-types with a non np.nan value
nor:
>>> df.replace(np.nan, None)
TypeError: cannot replace [nan] with method pad on a DataFrame
I used to have a DataFrame with only string values, so I could do:
>>> df[df == ""] = None
which worked. But now that I have mixed datatypes, it's a no go.
For various reasons specific to my code, it would be helpful to be able to use None as my null value. Is there a way I can set the null values to None? Or do I just have to go back through my other code and make sure I'm using np.isnan or pd.isnull everywhere?
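On the last point, here is a minimal sketch of how the two checks differ when both None and np.nan may be present. The sample values are made up for illustration. pd.isnull treats both as missing, while np.isnan only accepts numeric input.

```python
import numpy as np
import pandas as pd

# pd.isnull (an alias of pd.isna) treats both None and NaN as missing.
none_is_null = pd.isnull(None)
nan_is_null = pd.isnull(np.nan)

# np.isnan only accepts numeric input and raises TypeError on None.
try:
    np.isnan(None)
    isnan_accepts_none = True
except TypeError:
    isnan_accepts_none = False

# Element-wise check on an object Series holding mixed values (illustrative data).
mixed = pd.Series([1, "two", None, np.nan], dtype=object)
mask = mixed.isnull().tolist()
```

So if None and np.nan can both occur, pd.isnull is the check that covers both.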
Upvotes: 54
Views: 76310
Reputation: 708
Found this helpful for replacing NaN values with None: df.replace({np.nan: None}) (with numpy imported as np; the older pd.np alias was deprecated in pandas 1.0 and later removed)
Upvotes: 4
Reputation: 1493
Expanding on the accepted answer: when you also need to catch NaN
values within numeric dtype columns, you may need to change the dtype to object
first:
df.astype(object).where(df.notna(), None)
as per the original reply by @BENNY
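A minimal sketch of that approach on a frame with a float column, assuming a reasonably recent pandas. The column names and values are hypothetical and only there for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative frame: one float column and one string column,
# each with a missing value.
df = pd.DataFrame({"num": [1.5, np.nan], "text": ["two", np.nan]})

# Cast to object first so the columns can hold None, then keep the
# non-null cells and put None everywhere else.
result = df.astype(object).where(df.notna(), None)

num_values = result["num"].tolist()
text_values = result["text"].tolist()
```

Note that the result columns are object dtype, so numeric operations on them will be slower than on the original float column.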
Upvotes: 8
Reputation: 294488
Use pd.DataFrame.where.
It uses the value from df when the condition is met, and None otherwise:
df.where(df.notnull(), None)
Upvotes: 99