Unable to handle NaN in pandas dataframe

Question

I have a pandas DataFrame with a column that, when printed, appears to contain mostly NaN. Its dtype is object. However, when I call isnull(), it returns False everywhere. Why are these NaN values not treated as missing, and is there a way to convert them to proper missing values?

Thank you.

piRSquared · Accepted Answer

Your NaN values are strings

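import numpy as np
import pandas as pd

# One ordinary string, one literal string 'NaN', one real missing value
df = pd.DataFrame(dict(A=['Not NaN', 'NaN', np.nan]))
print(df)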

         A
0  Not NaN
1      NaN
2      NaN

What's missing

print(df.isnull())

       A
0  False
1  False
2   True

The string 'NaN' is not missing; only np.nan is, even though both print the same way.
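
To confirm which entries are strings and which are real missing values, one option is to look at the Python type of each value. This is only a diagnostic sketch on the same toy frame; the name kinds is made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(dict(A=['Not NaN', 'NaN', np.nan]))

# Map every value to the name of its Python type.
# The literal 'NaN' shows up as str, while np.nan is a float.
kinds = df['A'].map(lambda v: type(v).__name__)
print(kinds.tolist())
```

Any row reported as str is text, whatever it looks like when printed.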

You can mask the 'NaN' strings so that they become real missing values:

df.A.mask(df.A.eq('NaN')).isnull()

0    False
1     True
2     True
Name: A, dtype: bool
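
If you want to store the result rather than just test it, a sketch along these lines may help. It assumes the placeholder is exactly the string 'NaN' (adjust for 'nan', 'NA', stray whitespace, and so on); the names cleaned, nums and as_float are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(dict(A=['Not NaN', 'NaN', np.nan]))

# Replace the literal string 'NaN' with a real missing value
cleaned = df.replace('NaN', np.nan)
print(cleaned.isnull())

# If the column is meant to be numeric, coerce it instead:
# anything that cannot be parsed as a number becomes NaN
nums = pd.Series(['1.5', 'NaN', 'abc'], dtype=object)
as_float = pd.to_numeric(nums, errors='coerce')
print(as_float)
```

The replace version keeps the column as object dtype, while to_numeric gives a float column, which is usually what you want if the data was numeric to begin with.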
