Reputation: 3497
I need to compare some numpy arrays which should have the same elements in the same order, except for some NaN values in the second one.
I need a function more or less like this:
def func( array1, array2 ):
    if ???:
        return True
    else:
        return False
Example:
x = np.array( [ 1, 2, 3, 4, 5 ] )
y = np.array( [ 11, 2, 3, 4, 5 ] )
z = np.array( [ 1, 2, np.nan, 4, 5] )
func( x, z ) # returns True
func( y, z ) # returns False
The arrays always have the same length and the NaN values are always in the second argument (z in the example); x and y always contain numbers only. I imagine a function for this already exists, but I just can't find it.
Any ideas?
Upvotes: 5
Views: 3319
Reputation: 18806
numpy.isclose() now provides an argument, equal_nan, for treating NaNs as equal!
>>> import numpy as np
>>> np.isclose([1.0, np.nan], [1.0, np.nan])
array([ True, False])
>>> np.isclose([1.0, np.nan], [1.0, np.nan], equal_nan=True)
array([ True, True])
docs https://numpy.org/doc/stable/reference/generated/numpy.isclose.html
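Note that equal_nan=True only treats a NaN as equal to another NaN at the same position; a NaN paired with a number still compares as False. A minimal sketch with the arrays from the question (the variable name pairwise is just for illustration):

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
z = np.array([1, 2, np.nan, 4, 5])

# equal_nan=True matches NaN against NaN only, so the pair (3, nan)
# at index 2 is still reported as not close
pairwise = np.isclose(x, z, equal_nan=True)

# Consequently the overall check fails for the question's arrays
all_close = bool(pairwise.all())
```

So when NaNs appear in only one of the arrays, as in the question, an explicit NaN mask is still needed on top of isclose.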
Upvotes: 1
Reputation: 879729
You could use isclose to check for equality (or closeness to within a given tolerance -- this is particularly useful when comparing floats) and use isnan to check for NaNs in the second array.
Combine the two with bitwise-or (|), and use all to demand every pair is either close or contains a NaN to obtain the desired result:
In [62]: np.isclose(x,z)
Out[62]: array([ True, True, False, True, True], dtype=bool)
In [63]: np.isnan(z)
Out[63]: array([False, False, True, False, False], dtype=bool)
So you could use:
def func(a, b):
    return (np.isclose(a, b) | np.isnan(b)).all()
In [67]: func(x, z)
Out[67]: True
In [68]: func(y, z)
Out[68]: False
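If the default tolerance is too loose or too strict for your data, isclose accepts rtol and atol. The sketch below assumes a hypothetical wrapper, func_tol, that passes them through using the same defaults as np.isclose (rtol=1e-05, atol=1e-08); the sample values are made up for illustration:

```python
import numpy as np

def func_tol(a, b, rtol=1e-05, atol=1e-08):
    # Hypothetical variant of func that exposes isclose's tolerances;
    # positions where b is NaN are treated as matching
    return bool((np.isclose(a, b, rtol=rtol, atol=atol) | np.isnan(b)).all())

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
z = np.array([1.0, 2.0000001, np.nan, 4.0, 5.0])

# The 1e-7 difference at index 1 is within the default tolerance
close_default = func_tol(x, z)

# Zero tolerances demand exact equality, so the same pair now fails
close_exact = func_tol(x, z, rtol=0.0, atol=0.0)
```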
Upvotes: 2
Reputation: 97601
You can use masked arrays, which have the behaviour you're asking for when combined with np.all:
zm = np.ma.masked_where(np.isnan(z), z)
np.all(x == zm) # returns True
np.all(y == zm) # returns False
Or you could just write out your logic explicitly, noting that numpy arrays need the element-wise | instead of or, and that | binds more tightly than ==, hence the parentheses:
def func(a, b):
    return np.all((a == b) | np.isnan(a) | np.isnan(b))
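To illustrate why the parentheses around a == b matter, here is a small sketch using the arrays from the question (the names ok and raised are only for illustration). In Python, | has higher precedence than ==, so the unparenthesized form groups differently:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
z = np.array([1, 2, np.nan, 4, 5])

# With parentheses: compare first, then OR with the NaN mask
ok = bool(np.all((x == z) | np.isnan(z)))

# Without parentheses this parses as x == (z | np.isnan(z)),
# and bitwise-or is not defined for a float array, so NumPy raises TypeError
try:
    x == z | np.isnan(z)
    raised = False
except TypeError:
    raised = True
```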
Upvotes: 6
Reputation: 476709
What about:
from math import isnan

def fun(array1, array2):
    return all(isnan(x) or isnan(y) or x == y for x, y in zip(array1, array2))
This function works in both directions (if there are NaNs in the first list, these are also ignored). If you do not want that (which is a bit odd, since equality is usually symmetric), you can define:
from math import isnan

def fun(array1, array2):
    return all(isnan(y) or x == y for x, y in zip(array1, array2))
The code works as follows: we use zip to emit tuples of elements of both arrays. Next we check whether the element of the first list is NaN (first version only), or the element of the second is, or they are equal.
If you want a more robust function, you should also perform a length check:
from math import isnan

def fun(array1, array2):
    return len(array1) == len(array2) and all(isnan(y) or x == y for x, y in zip(array1, array2))
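As a quick check, here is the length-checked version applied to the arrays from the question, plus a truncated input to exercise the length test. The variable names same, different and shorter are only for illustration:

```python
from math import isnan

import numpy as np

def fun(array1, array2):
    # Equal lengths, and each pair is either equal or has a NaN in array2
    return len(array1) == len(array2) and all(
        isnan(y) or x == y for x, y in zip(array1, array2)
    )

x = np.array([1, 2, 3, 4, 5])
y = np.array([11, 2, 3, 4, 5])
z = np.array([1, 2, np.nan, 4, 5])

same = fun(x, z)         # only the NaN position differs
different = fun(y, z)    # 11 != 1 at index 0
shorter = fun(x[:4], z)  # lengths 4 and 5 do not match
```

Since this iterates element by element in Python, it also works on plain lists, which the numpy-based versions do not require.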
Upvotes: 1