user3601754

Reputation: 3862

Python - replace masked data in arrays

I would like to replace all the masked values in my 2D array with zero. I saw that this is apparently possible with np.copyto, like this:

test=np.copyto(array, 0, where = mask)

But I get an error message: 'module' object has no attribute 'copyto'. Is there an equivalent way to do this?

Upvotes: 0

Views: 8970

Answers (2)

Ishan Tomar

Reputation: 1554

Try numpy.ma.filled(). I think this is exactly what you need (in the session below, n is an alias for numpy):

In [29]: a
Out[29]: array([ 1,  0, 25,  0,  1,  4,  0,  2,  3,  0])
In [30]: am = n.ma.MaskedArray(n.ma.log(a),fill_value=0)
In [31]: am
Out[31]: 
masked_array(data = [0.0 -- 3.2188758248682006 -- 0.0 1.3862943611198906 --  0.6931471805599453  1.0986122886681098 --], 
mask = [False  True False  True False False  True False False  True],
fill_value = 0.0)
In [32]: am.filled()
Out[32]: 
array([ 0.        ,  0.        ,  3.21887582,  0.        ,  0.        ,
    1.38629436,  0.        ,  0.69314718,  1.09861229,  0.        ])
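For the 2D case in the question, the same idea might look like the sketch below. The names data, mask, masked and filled are made up for illustration, and it assumes the mask is a boolean array with the same shape as the data.

```python
import numpy as np

# Hypothetical 2D data and a boolean mask of the same shape
# (True marks the values to be treated as masked).
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])
mask = np.array([[False, True, False],
                 [True, False, False]])

# Build a masked array, then replace every masked entry with 0.
masked = np.ma.masked_array(data, mask=mask)
filled = masked.filled(0)  # np.ma.filled(masked, 0) does the same

print(filled)
```

filled() returns a plain ndarray, so the result no longer carries a mask.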

Upvotes: 4

unutbu

Reputation: 879113

test = np.copyto(array, 0, where=mask) is equivalent to:

array = np.where(mask, 0, array)
test = None

(I'm not sure why you would want to assign a value to the return value of np.copyto; it always returns None if no Exception is raised.)
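As a rough illustration of that equivalence, here is a small sketch with a made-up 2x3 array and a mask of the same shape. It assumes a NumPy installation that provides np.copyto, which the one in the question apparently does not.

```python
import numpy as np

# Hypothetical example data; the mask has the same shape as the array.
array = np.arange(1, 7).reshape(2, 3)
mask = np.array([[True, False, False],
                 [False, True, False]])

# np.where builds a new array and leaves the original untouched.
out_of_place = np.where(mask, 0, array)

# np.copyto writes into its first argument and returns None.
copied = array.copy()
result = np.copyto(copied, 0, where=mask)

print(out_of_place)
print(copied)
print(result)
```

Both out_of_place and copied end up with zeros where mask is True, while result is None.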


Why not use array[mask] = 0?

Indeed, that would work (and has nicer syntax) if mask is a boolean array with the same shape as array. If mask doesn't have the same shape then array[mask] = 0 and np.copyto(array, 0, where=mask) may behave differently:

np.copyto (is documented to) and np.where (appears to) broadcast the shape of the mask to match array. In contrast, array[mask] = 0 does not broadcast mask. This leads to a big difference in behavior when the mask does not have the same shape as array:

In [60]: array = np.arange(12).reshape(3,4)

In [61]: mask = np.array([True, False, False, False], dtype=bool)

In [62]: np.where(mask, 0, array)
Out[62]: 
array([[ 0,  1,  2,  3],
       [ 0,  5,  6,  7],
       [ 0,  9, 10, 11]])

In [63]: array[mask] = 0

In [64]: array
Out[64]: 
array([[ 0,  0,  0,  0],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

When array is 2-dimensional and mask is a 1-dimensional boolean array, array[mask] is selecting rows of array (where mask is True) and array[mask] = 0 sets those rows to zero.

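Surprisingly, in the NumPy version used here, array[mask] does not raise an IndexError even though the mask has 4 elements and array has only 3 rows. No IndexError is raised when the fourth value is False, but one is raised if the fourth value is True (more recent NumPy releases reject any boolean index whose length does not match the indexed dimension):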

In [91]: array[np.array([True, False, False, False])]
Out[91]: array([[0, 1, 2, 3]])

In [92]: array[np.array([True, False, False, True])]
IndexError: index 3 is out of bounds for axis 0 with size 3
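If you want the index syntax but with the broadcasting behaviour of np.where, one option is to broadcast the mask to the array's shape yourself before indexing. This is only a sketch: it assumes np.broadcast_to is available in your NumPy, and full_mask and zeroed are names invented for the example.

```python
import numpy as np

array = np.arange(12).reshape(3, 4)
mask = np.array([True, False, False, False])

# Expand the 1D mask to the full (3, 4) shape so that it selects
# individual elements instead of whole rows.
full_mask = np.broadcast_to(mask, array.shape)

zeroed = array.copy()
zeroed[full_mask] = 0

print(zeroed)
```

With the mask expanded this way, the assignment zeroes the first column, matching what np.where(mask, 0, array) produces.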

Upvotes: 1
