Reputation: 7486
I'm using np.isin() to calculate the overlap between two arrays, e.g.:
np.isin(np.random.randint(0, 10, 3), np.random.randint(0, 10, 3)).sum()
The problem is that I have a case where I need a NULL value (an all-zero row would be a good candidate):
z = np.array([0, 0, 0], dtype=np.uint16)
np.isin(z, np.array([0, 2, 3])).sum()
: 3
But the overlap should be zero, not 3. It comes out as 3 because zero is also real data. Currently I use a NULL value of 65535 (i.e. -1 wrapped around to uint16), which I don't like very much:
z = np.full(3, 65535, dtype=np.uint16)  # 65535 is the largest uint16 value
np.isin(z, np.array([0,2,3], dtype=np.uint16)).sum()
: 0
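The sentinel can at least be given a name instead of being written as -1. A minimal sketch, assuming a hypothetical constant NULL_U16 (not a NumPy name) taken from np.iinfo:

```python
import numpy as np

# NULL_U16 is a hypothetical name for the sentinel; np.iinfo reports the dtype's limits.
NULL_U16 = np.iinfo(np.uint16).max  # largest value a uint16 can hold

z = np.full(3, NULL_U16, dtype=np.uint16)      # an "all NULL" row
ref = np.array([0, 2, 3], dtype=np.uint16)     # real data, may contain 0

# The sentinel never appears in real data, so it never counts as overlap.
overlap = int(np.isin(z, ref).sum())
```

This only works as long as 65535 can never occur as real data.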
The problem, as you can see, is that the NULL value cannot be zero, because zero is legitimate data.
Is there a standardized way of handling NIL/NULL data in NumPy?
I should also have mentioned that the dtype must be np.uint16, and assigning np.nan to such an array does not work:
In [137]: zz = np.zeros(5, dtype=np.uint16)
In [138]: zz
Out[138]: array([0, 0, 0, 0, 0], dtype=uint16)
In [139]: zz[:] = np.nan
In [140]: zz
Out[140]: array([0, 0, 0, 0, 0], dtype=uint16)
Upvotes: 1
Views: 2225
Reputation: 22989
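You can use np.nan, which never compares equal to anything, itself included. Note that this needs a float array, since integer dtypes such as np.uint16 cannot hold NaN: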
>>> np.nan == np.nan
False
>>> z = np.array([np.nan, np.nan, np.nan])
>>> np.isin(z,np.array([np.nan,2,3])).sum()
0
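If the dtype has to stay np.uint16, one alternative is to keep a separate boolean validity mask next to the data instead of reserving a value. This is only a sketch under that assumption; overlap_count and valid are hypothetical names, not NumPy API:

```python
import numpy as np

def overlap_count(values, reference, valid):
    """Count entries of `values` found in `reference`, ignoring entries where `valid` is False."""
    return int((np.isin(values, reference) & valid).sum())

ref = np.array([0, 2, 3], dtype=np.uint16)

# A NULL row: the stored zeros are placeholders, and the mask marks them as missing.
z = np.zeros(3, dtype=np.uint16)
z_valid = np.zeros(z.shape, dtype=bool)
null_overlap = overlap_count(z, ref, z_valid)

# A real row in which 0 is legitimate data and must still be counted.
real = np.array([0, 5, 3], dtype=np.uint16)
real_valid = np.ones(real.shape, dtype=bool)
real_overlap = overlap_count(real, ref, real_valid)
```

Here the NULL row contributes no overlap, while the real row still matches on 0 and 3. The cost is one extra boolean per element, but every uint16 value remains usable as data.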
Upvotes: 1