GoCurry

Reputation: 989

Element-wise comparison with NaNs as equal

If I run the following code:

import numpy as np
import pandas as pd

dft1 = pd.DataFrame({'a':[1, np.nan, np.nan]})
dft2 = pd.DataFrame({'a':[1, 1, np.nan]})
dft1.a==dft2.a

The result is

0     True
1    False
2    False
Name: a, dtype: bool

How can I make the result be

0     True
1    False
2     True
Name: a, dtype: bool

That is, I want np.nan == np.nan to evaluate to True.

I thought this was basic functionality and that I must be asking a duplicate question, but I spent a lot of time searching SO and Google and couldn't find an answer.

Upvotes: 18

Views: 4759

Answers (4)

piRSquared

Reputation: 294516

np.nan is defined to not be equal to np.nan.

Iterate

Check whether each pair is equal or both values are np.nan

def naneq(t):
  return (t[0] == t[1]) or np.isnan(t).all()

[*map(naneq, zip(dft1.a, dft2.a))]

[True, False, True]

nunique

Count the unique values in each row. Make sure to set the argument dropna=False so that NaN is counted as a value.

pd.concat([dft1, dft2], axis=1).nunique(axis=1, dropna=False) == 1

0     True
1    False
2     True
dtype: bool

Upvotes: 2

user3483203

Reputation: 51175

Using np.isclose with equal_nan=True:

np.isclose(dft1, dft2, equal_nan=True, rtol=0, atol=0)

array([[ True],
       [False],
       [ True]])

It's important to set both atol and rtol to zero so that values that are merely close, rather than exactly equal, are not reported as equal.
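To see why the tolerances matter, here is a small sketch comparing NumPy's default tolerances (rtol=1e-05, atol=1e-08) with zero tolerances. The sample arrays are made up for illustration and are not from the question.

```python
import numpy as np

a = np.array([1.0, np.nan])
b = np.array([1.0 + 1e-9, np.nan])  # illustrative: first value differs very slightly

# Default tolerances (rtol=1e-05, atol=1e-08) treat the close values as equal.
loose = np.isclose(a, b, equal_nan=True)

# Zero tolerances require exact equality, while NaN still matches NaN.
strict = np.isclose(a, b, equal_nan=True, rtol=0, atol=0)

print(loose.tolist())
print(strict.tolist())
```

With the defaults the first pair is reported as equal; with both tolerances at zero it is not, and the NaN pair matches in both cases.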

Upvotes: 9

cs95

Reputation: 403110

I can't think of a function that already does this for you (weird), so you can just do it yourself:

dft1.eq(dft2) | (dft1.isna() & dft2.isna())

       a
0   True
1  False
2   True

Note the presence of the parentheses. Precedence is a thing to watch out for when working with overloaded bitwise operators in pandas.
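If you need this in several places, the expression can be wrapped in a small helper. This is only a sketch: `eq_nan` is a hypothetical name, not a pandas function, and it assumes both inputs are Series or DataFrames with identical labels.

```python
import numpy as np
import pandas as pd


def eq_nan(left, right):
    """Element-wise equality that also treats NaN == NaN as True.

    Hypothetical helper (not part of pandas); assumes both inputs are
    Series or DataFrames with identical labels.
    """
    # Parentheses are required: & and | bind more tightly than ==,
    # so each comparison has to be grouped explicitly.
    return (left == right) | (left.isna() & right.isna())


dft1 = pd.DataFrame({'a': [1, np.nan, np.nan]})
dft2 = pd.DataFrame({'a': [1, 1, np.nan]})

result = eq_nan(dft1.a, dft2.a)
print(result.tolist())
```

Without the parentheses around `left == right`, Python would evaluate the `|` first and the comparison would mean something else entirely.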

Another option is np.nan_to_num, which is valid only if you are certain that the index and columns of both DataFrames are identical:

np.nan_to_num(dft1) == np.nan_to_num(dft2)

array([[ True],
       [False],
       [ True]])

np.nan_to_num replaces NaNs with a filler value (0.0 by default for floating-point data).
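One consequence of filling with 0 is worth checking before relying on this: a NaN on one side compares equal to a real 0 on the other. A small sketch with made-up frames (`left` and `right` are illustrative names, not from the question):

```python
import numpy as np
import pandas as pd

left = pd.DataFrame({'a': [np.nan, np.nan]})
right = pd.DataFrame({'a': [0.0, np.nan]})

# nan_to_num turns NaN into 0.0 by default, so row 0 is reported as equal
# even though one side is NaN and the other is a genuine zero.
filled = np.nan_to_num(left) == np.nan_to_num(right)

# Explicit NaN handling keeps the two cases apart.
explicit = (left.eq(right) | (left.isna() & right.isna())).to_numpy()

print(filled.ravel().tolist())
print(explicit.ravel().tolist())
```

If your data can contain zeros, the explicit isna() form is the safer of the two.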

Upvotes: 13

BENY

Reputation: 323376

np.nan is not equal to itself:

np.nan==np.nan
Out[609]: False

So fill the NaNs on both sides with the same placeholder value before comparing:

dft1.a.fillna('NaN')==dft2.a.fillna('NaN')
Out[610]: 
0     True
1    False
2     True
Name: a, dtype: bool

Upvotes: 5
