Andrii Furmanets

Reputation: 1161

How to proceed with `None` value in pandas fillna

I call fillna with the following dictionary:

fillna(value={'first_name':'Andrii', 'last_name':'Furmanets', 'created_at':None})

When I pass that dictionary to fillna I see:

raise ValueError('must specify a fill method or value')
ValueError: must specify a fill method or value

It seems to fail on the None value.

I use pandas version 0.20.3.

Upvotes: 59

Views: 84890

Answers (6)

Nikhil Malladi

Reputation: 11

Using the pandas where method, the NaN values in the DataFrame can be replaced with None:

df.where(pd.notnull(df), None)
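On more recent pandas releases (1.3 and later, as far as I know), calling where with None on a float column can leave NaN in place, because the column keeps its numeric dtype. A minimal sketch of a workaround, assuming it is acceptable for every column to become object dtype; the sample frame is borrowed from another answer in this thread:

```python
import pandas as pd

df = pd.DataFrame(dict(A=[1, None], B=[None, 2], C=[None, 'D']))

# Cast to object first so the columns can actually hold Python None,
# then keep the non-null cells and put None everywhere else.
result = df.astype(object).where(pd.notnull(df), None)
```

The cast to object gives up numeric dtypes, so this is probably best done as a last step before handing the data to something that expects None.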

Upvotes: 1

smci

Reputation: 33970

Solution: use pandas' pd.NA, not base Python None.

df = pd.DataFrame({'first_name':pd.NA, 'last_name':pd.NA, 'created_at':pd.NA}, index=[0])

df = df.fillna(value={'first_name':'Andrii', 'last_name':'Furmanets', 'created_at':pd.NA})

Generally it's better to leave pandas NA as-is. Do not try to change it. The presence of NA is a feature, not an issue. NA gets handled correctly in other pandas functions (but not in NumPy).

  • If you insist that Python None should replace pandas NA values for some downstream reason, show us the missing code that follows, where NA is causing an issue; that's usually an XY problem.

Upvotes: -1

AsaridBeck91

Reputation: 1466

In case you want to normalize all of the nulls to Python's None:

df.fillna(np.nan).replace([np.nan], [None])

The fillna call first replaces all of the null variants (None, NaT, np.nan, etc.) with NumPy's NaN; replace then swaps NumPy's NaN for Python's None.

Upvotes: 104

addicted

Reputation: 3041

Here is an alternative way to fill with None. I am on pandas 0.24.0 and I use this to insert NULL values into a PostgreSQL database.

# Stealing @piRSquared's dataframe
df = pd.DataFrame(dict(A=[1, None], B=[None, 2], C=[None, 'D']))

df

     A    B     C
0  1.0  NaN  None
1  NaN  2.0     D

# fill NaN with None. Basically it says, fill with None whenever you see NULL value.
df['A'] = np.where(df['A'].isnull(), None, df['A'])
df['B'] = np.where(df['B'].isnull(), None, df['B'])

# Result
df

      A     B     C
0   1.0  None  None
1  None   2.0     D
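If the database insert is the only reason for wanting None, another option is to leave the DataFrame untouched and convert at the boundary. A rough sketch, assuming a DB-API driver (psycopg2, for example) that sends Python None as NULL; the name records is only illustrative:

```python
import pandas as pd

df = pd.DataFrame(dict(A=[1, None], B=[None, 2], C=[None, 'D']))

# Build plain Python tuples, swapping every missing value for None.
records = [
    tuple(None if pd.isna(value) else value for value in row)
    for row in df.itertuples(index=False, name=None)
]
```

The resulting list of tuples can then be passed to whatever bulk-insert call the driver provides, and the DataFrame keeps its numeric dtypes.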

Upvotes: 2

piRSquared

Reputation: 294508

Setup
Consider the sample dataframe df

df = pd.DataFrame(dict(A=[1, None], B=[None, 2], C=[None, 'D']))

df

     A    B     C
0  1.0  NaN  None
1  NaN  2.0     D

I can confirm the error

df.fillna(dict(A=1, B=None, C=4))
ValueError: must specify a fill method or value

This happens because pandas is cycling through keys in the dictionary and executing a fillna for each relevant column. If you look at the signature of the pd.Series.fillna method

Series.fillna(value=None, method=None, axis=None, inplace=False, limit=None, downcast=None, **kwargs)

You'll see the default value is None. So we can replicate this error with

df.A.fillna(None)

Or equivalently

df.A.fillna()

I'll add that I'm not terribly surprised considering that you are attempting to fill a null value with a null value.


What you need is a workaround.

Solution
Use pd.DataFrame.fillna on the columns that you want to fill with non-null values. Then follow that up with pd.DataFrame.replace on the specific columns where you want to swap one null value for another.

df.fillna(dict(A=1, C=2)).replace(dict(B={np.nan: None}))

     A     B  C
0  1.0  None  2
1  1.0     2  D
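The same two-step idea can be wrapped so that the original dictionary from the question is accepted as-is. This is only a sketch: fillna_allowing_none is a hypothetical helper name, not a pandas function, and it assumes every key in the dictionary is a column of the frame. It uses astype(object) plus where in place of replace for the None step.

```python
import pandas as pd

def fillna_allowing_none(df, values):
    """Hypothetical helper: like df.fillna(values), except that a None
    entry means "store Python None in this column's missing cells"."""
    fill_values = {col: v for col, v in values.items() if v is not None}
    none_columns = [col for col, v in values.items() if v is None]

    # Step 1: ordinary fillna for the columns with real fill values.
    out = df.fillna(fill_values) if fill_values else df.copy()

    # Step 2: for the rest, cast to object so the column can hold None,
    # then replace the missing cells with None.
    for col in none_columns:
        as_object = out[col].astype(object)
        out[col] = as_object.where(as_object.notna(), None)
    return out

df = pd.DataFrame({
    'first_name': [None, 'Bob'],
    'last_name': ['Smith', None],
    'created_at': [float('nan'), 1.5],
})
result = fillna_allowing_none(
    df, {'first_name': 'Andrii', 'last_name': 'Furmanets', 'created_at': None}
)
```

The input frame is left unmodified; only the returned copy has its created_at column converted to object dtype.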

Upvotes: 23

atwalsh

Reputation: 3722

What type of data structure are you using? This works for a pandas Series:

import pandas as pd

d = pd.Series({'first_name': 'Andrii', 'last_name':'Furmanets', 'created_at':None})
d = d.fillna('DATE')

Upvotes: 4
