Reputation: 5027
I have a masked numpy array like the following:
mar = np.ma.array([0, 0, 100, 100], mask=[False, True, True, False], fill_value=-1)
So the two values in the middle are masked; calling mar.filled() would return [0, -1, -1, 100].
I want to compare this array to the scalar 0, i.e.:
mar == 0
which returns
masked_array(data = [True -- -- False],
mask = [False True True False],
fill_value = True)
Note that the fill_value is now True, which is the default fill value for bool arrays but does not make sense to me in this case (I would have expected it to be set to -1 == 0, which is False).
To illustrate my problem more clearly: (mar == 0).filled() and mar.filled() == 0 do not return the same result.
Is this intended behaviour or is it a bug? In any case, is there a workaround to achieve my desired behaviour? I know that I can just convert to a normal array before comparison using .filled(), but I would like to avoid that if possible, since the code should not care whether it is a masked array or a normal one.
Upvotes: 0
Views: 990
Reputation: 231605
mar == 0 uses mar.__eq__(0). The docs for that method say:
When either of the elements is masked, the result is masked as well, but the underlying boolean data are still set, with self and other considered equal if both are masked, and unequal otherwise.
That method in turn uses mar._comparison. This first performs the comparison on the .data attributes:
In [16]: mar.data
Out[16]: array([ 0, 0, 100, 100])
In [17]: mar.data == 0
Out[17]: array([ True, True, False, False])
But then it compares the masks and adjusts the values. 0 is not masked, so its 'mask' is False. Since the mask for the masked elements of mar is True, the masks don't match there, and the comparison's .data is set to False at those positions.
In [19]: np.ma.getmask(0)
Out[19]: False
In [20]: mar.mask
Out[20]: array([False, True, True, False])
In [21]: (mar==0).data
Out[21]: array([ True, False, False, False])
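The mask adjustment described above can be sketched in plain NumPy. This is only an illustration of the steps, not the library's actual code; `emulate_eq_data` is a hypothetical helper name, and the sketch assumes simple (non-structured) dtypes.

```python
import numpy as np

mar = np.ma.array([0, 0, 100, 100], mask=[False, True, True, False], fill_value=-1)
mar0 = np.ma.array([0, 0, 0, 0], mask=[False, True, True, False], fill_value=-1)

def emulate_eq_data(a, b):
    """Hypothetical helper: rebuild the .data of `a == b` step by step."""
    # Step 1: compare the underlying data, ignoring any masks.
    raw = np.ma.getdata(a) == np.ma.getdata(b)
    # Step 2: get full boolean masks (an unmasked operand yields all False).
    a_mask = np.ma.getmaskarray(a)
    b_mask = np.broadcast_to(np.ma.getmaskarray(b), a_mask.shape)
    # Step 3: where either side is masked, the result is "equal" only if
    # both sides are masked; elsewhere keep the raw comparison.
    either = a_mask | b_mask
    return np.where(either, a_mask == b_mask, raw)

print(emulate_eq_data(mar, 0))     # masked vs. unmasked scalar
print(emulate_eq_data(mar, mar0))  # masked vs. identically masked array
```

With the scalar, the two masked positions come out False because only one side is masked; with an identically masked array they come out True.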
I get a different fill_value in the comparison. That could be a change in v 1.14.0.
In [24]: mar==0
Out[24]:
masked_array(data=[True, --, --, False],
mask=[False, True, True, False],
fill_value=-1)
In [27]: (mar==0).filled()
Out[27]: array([True, -1, -1, False], dtype=object)
This is confusing. Comparisons (and in general most functions) on masked arrays have to deal with the .data, the mask, and the fill value. Numpy code that isn't ma aware usually works with the .data and ignores the masking. ma methods may work with the filled() values, or with the compressed ones. This comparison method attempts to take all 3 attributes into account.
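As a small illustration of how those views of the same masked array differ, here is a sketch using the array from the question (the variable names `raw`, `filled`, and `kept` are just for this example):

```python
import numpy as np

mar = np.ma.array([0, 0, 100, 100], mask=[False, True, True, False], fill_value=-1)

raw = mar.data           # underlying values, mask ignored: [0, 0, 100, 100]
filled = mar.filled()    # masked slots replaced by fill_value: [0, -1, -1, 100]
kept = mar.compressed()  # only the unmasked values: [0, 100]

# The same reduction gives different answers depending on the view.
print(raw.sum())     # 200
print(filled.sum())  # 98
print(kept.sum())    # 100
print(mar.sum())     # 100, the masked-aware sum skips masked entries
```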
Testing the equality with a masked 0 array (same mask and fill_value):
In [34]: mar0 = np.ma.array([0, 0, 0, 0], mask=[False, True, True, False], fill_value=-1)
In [35]: mar0
Out[35]:
masked_array(data=[0, --, --, 0],
mask=[False, True, True, False],
fill_value=-1)
In [36]: mar == mar0
Out[36]:
masked_array(data=[True, --, --, False],
mask=[False, True, True, False],
fill_value=-1)
In [37]: _.data
Out[37]: array([ True, True, True, False])
mar == 0 is the same as mar == np.ma.array([0, 0, 0, 0], mask=False)
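A quick way to check that equivalence is to compare the data and mask of both results. This is a sketch; the fill_value of the results is deliberately not compared, since it appears to differ between NumPy versions.

```python
import numpy as np

mar = np.ma.array([0, 0, 100, 100], mask=[False, True, True, False], fill_value=-1)
unmasked_zeros = np.ma.array([0, 0, 0, 0], mask=False)

via_scalar = (mar == 0)
via_array = (mar == unmasked_zeros)

# Compare the underlying boolean data and the masks of both results.
same_data = np.array_equal(via_scalar.data, via_array.data)
same_mask = np.array_equal(np.ma.getmaskarray(via_scalar),
                           np.ma.getmaskarray(via_array))
print(same_data, same_mask)
```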
Upvotes: 3
Reputation: 8059
I don't know why (mar == 0) does not yield the desired output. But you can consider np.equal(mar, 0), which retains the original fill value.
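If filling before the comparison is acceptable, another option is the function np.ma.filled, which accepts both masked and plain arrays, so the calling code does not need to know which one it has. A minimal sketch, where `equals_scalar` is a hypothetical helper name:

```python
import numpy as np

def equals_scalar(arr, value):
    """Hypothetical helper: fill masked slots first, then compare.

    np.ma.filled returns plain ndarrays unchanged and replaces masked
    entries of a masked array with that array's own fill_value.
    """
    return np.ma.filled(arr) == value

mar = np.ma.array([0, 0, 100, 100], mask=[False, True, True, False], fill_value=-1)
plain = np.array([0, 0, 100, 100])

print(equals_scalar(mar, 0))    # same as mar.filled() == 0
print(equals_scalar(plain, 0))  # ordinary element-wise comparison
```

The result is always a plain boolean ndarray, so the mask information is dropped; that is the trade-off of this approach.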
Upvotes: 0