Reputation: 8090
I would expect the result of a summation for a fully masked array to be zero, but instead "masked" is returned. How can I get the function to return zero?
>>> a = np.asarray([1, 2, 3, 4])
>>> b = np.ma.masked_array(a, mask=~(a > 2))
>>> b
masked_array(data = [-- -- 3 4],
mask = [ True True False False],
fill_value = 999999)
>>> b.sum()
7
>>> b = np.ma.masked_array(a, mask=~(a > 5))
>>> b
masked_array(data = [-- -- -- --],
mask = [ True True True True],
fill_value = 999999)
>>> b.sum()
masked
>>> np.ma.sum(b)
masked
>>>
Here's another unexpected thing:
>>> b.sum() + 3
masked
Upvotes: 4
Views: 3507
Reputation: 231395
In your last case (here b1 is your fully masked array):
In [197]: bs=b1.sum()
In [198]: bs.data
Out[198]: array(0.0)
In [199]: bs.mask
Out[199]: array(True, dtype=bool)
In [200]: repr(bs)
Out[200]: 'masked'
In [201]: str(bs)
Out[201]: '--'
If I specify keepdims, I get a different array:
In [208]: bs=b1.sum(keepdims=True)
In [209]: bs
Out[209]:
masked_array(data = [--],
mask = [ True],
fill_value = 999999)
In [210]: bs.data
Out[210]: array([0])
In [211]: bs.mask
Out[211]: array([ True], dtype=bool)
Here's the relevant part of the sum code:
def sum(self, axis=None, dtype=None, out=None, keepdims=np._NoValue):
    kwargs = {} if keepdims is np._NoValue else {'keepdims': keepdims}
    _mask = self._mask
    newmask = _check_mask_axis(_mask, axis, **kwargs)
    # No explicit output
    if out is None:
        result = self.filled(0).sum(axis, dtype=dtype, **kwargs)
        rndim = getattr(result, 'ndim', 0)
        if rndim:
            result = result.view(type(self))
            result.__setmask__(newmask)
        elif newmask:
            result = masked
        return result
....
It's the
newmask = np.ma.core._check_mask_axis(b1.mask, axis=None)
...
elif newmask: result = masked
lines that produce the masked value in your case. newmask is True when all values are masked, and False if some are not. The choice to return np.ma.masked is deliberate.
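To see the same check without reaching into the private helper, here is a minimal sketch. It assumes the check amounts to "is every element masked?", and the all_masked function is a hypothetical name of mine, not part of NumPy:

```python
import numpy as np

a = np.asarray([1, 2, 3, 4])
partly_masked = np.ma.masked_array(a, mask=~(a > 2))
fully_masked = np.ma.masked_array(a, mask=~(a > 5))

def all_masked(arr, axis=None):
    """Hypothetical helper: True when every element along axis is masked."""
    # getmaskarray always returns a full boolean array,
    # even when the array carries no explicit mask.
    return np.ma.getmaskarray(arr).all(axis=axis)

partly_flag = all_masked(partly_masked)
fully_flag = all_masked(fully_masked)
print(partly_flag, fully_flag)
```

When the flag is True and the result is a scalar, sum() hands back np.ma.masked instead of a number.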
The core of the calculation is:
In [218]: b1.filled(0).sum()
Out[218]: 0
The rest of the code decides whether to return a scalar, a masked array, or np.ma.masked.
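So if you want zero for a fully masked array, one option is to do that core step yourself. This is only a sketch; sum_or_zero is a name I made up for illustration:

```python
import numpy as np

def sum_or_zero(arr, axis=None):
    """Hypothetical helper: sum a masked array, counting masked entries as 0."""
    # filled(0) returns a plain ndarray with masked slots replaced by 0,
    # so the sum is an ordinary number and never np.ma.masked.
    return np.ma.filled(arr, 0).sum(axis=axis)

a = np.asarray([1, 2, 3, 4])
partly_masked = np.ma.masked_array(a, mask=~(a > 2))
fully_masked = np.ma.masked_array(a, mask=~(a > 5))

partly_total = sum_or_zero(partly_masked)
fully_total = sum_or_zero(fully_masked)
print(partly_total, fully_total)
```

Note that this throws away the information that nothing was summed, which is presumably why sum() itself does not do it.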
============
And for your addition:
In [232]: np.ma.masked+3
Out[232]: masked
It looks like np.ma.masked is a special array that propagates itself across calculations, sort of like np.nan.
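If you would rather keep using sum() and only substitute zero afterwards, you can test for the masked constant before doing further arithmetic. A small sketch, with variable names chosen just for this example:

```python
import numpy as np

a = np.asarray([1, 2, 3, 4])
b = np.ma.masked_array(a, mask=~(a > 5))  # every element masked

total = b.sum()

# The masked result survives arithmetic, much as nan does.
propagated = total + 3
still_masked = np.ma.is_masked(propagated)
nan_survives = np.isnan(np.nan + 3)

# np.ma.masked is a singleton, so an identity test is enough here.
safe_total = 0 if total is np.ma.masked else total
print(still_masked, nan_survives, safe_total + 3)
```

Unlike nan, the masked constant can be checked with `is`, so the fallback is easy to write.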
Upvotes: 4