Reputation: 2962
Say I have a two dimensional array of coordinates that looks something like
x = np.array([[1,2],[2,3],[3,4]])
Earlier in my code I generated a mask that looks something like
mask = [False,False,True]
When I try to use this mask on the 2D coordinate vector, I get an error
newX = np.ma.compressed(np.ma.masked_array(x,mask))
>>>numpy.ma.core.MaskError: Mask and data not compatible: data size is 6, mask size is 3.
which makes sense, I suppose. So I tried to simply use the following mask instead:
mask2 = np.column_stack((mask,mask))
newX = np.ma.compressed(np.ma.masked_array(x,mask2))
And what I get is close:
>>>array([1,2,2,3])
to what I would expect (and want):
>>>array([[1,2],[2,3]])
There must be an easier way to do this?
Upvotes: 46
Views: 183776
Reputation: 3677
With np.where you can do all sorts of things:
x_maskd = np.where(mask, x, 0)
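np.where takes three arguments: a condition, x, and y. All three arguments must be broadcastable to the same shape. In locations where mask is True, the x value is returned; otherwise, the y value is returned. Note that a 1-D mask of length 3 does not broadcast against a 3x2 array as is; it needs a trailing axis first, for example np.asarray(mask)[:, None].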
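A minimal sketch of this approach applied to the arrays from the question. The name row_mask is only illustrative, and the mask is assumed to hold one flag per row:

```python
import numpy as np

x = np.array([[1, 2], [2, 3], [3, 4]])
mask = np.array([False, False, True])   # one flag per row

# Shape (3,) does not broadcast against shape (3, 2);
# shape (3, 1) does, so add a trailing axis.
row_mask = mask[:, None]

# Keep x where the row flag is True, fill with 0 elsewhere.
x_maskd = np.where(row_mask, x, 0)
print(x_maskd)
```

Unlike boolean indexing, np.where keeps the original shape and replaces the unselected values instead of removing them, so whether it fits depends on which of the two you need.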
Upvotes: 15
Reputation: 128
masked_X = np.where(mask, X, 0)
is a fast and simple way to mask data (the masked entries are replaced by 0 rather than hidden):
X = np.array([[2,-1,4],
              [3,-3,1],
              [9,-7,2]])
mask = np.identity(3)
Timing:
%timeit np.where(mask,X,0)
969 ns ± 14.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit np.ma.array(X, mask=mask)
6.47 µs ± 85.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
I'll let you draw your own conclusion!
Upvotes: 0
Reputation: 31
If you have
A = [[ 8. 0. 165. 22. 164. 47. 184. 185.]
[ 0. 6. -74. -27. 63. 49. -46. -48.]
[165. -74. 0. 0. 0. 0. 0. 0.]
[ 22. -27. 0. 0. 0. 0. 0. 0.]
[164. 63. 0. 0. 0. 0. 0. 0.]
[ 47. 49. 0. 0. 0. 0. 0. 0.]
[184. -46. 0. 0. 0. 0. 0. 0.]
[185. -48. 0. 0. 0. 0. 0. 0.]]
and your mask is
mask = np.array([True, True, True, False, True, False, True, False])
then your masked A becomes
A[mask, :][:, mask] = [[ 8. 0. 165. 164. 184.]
[ 0. 6. -74. 63. -46.]
[165. -74. 0. 0. 0.]
[164. 63. 0. 0. 0.]
[184. -46. 0. 0. 0.]]
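The same row-and-column selection can be sketched on a smaller array. The 4x4 array here is made up for illustration, and np.ix_ is shown only as an alternative spelling of the chained indexing above:

```python
import numpy as np

A = np.arange(16.0).reshape(4, 4)
mask = np.array([True, False, True, True])

# Chained indexing: first keep the selected rows, then the selected columns.
sub_chained = A[mask, :][:, mask]

# np.ix_ builds an open mesh from the two masks and selects in one step.
sub_ix = A[np.ix_(mask, mask)]

# Note: A[mask, mask] is different. It pairs the indices up and returns
# only the elements A[0, 0], A[2, 2], A[3, 3].
print(sub_chained)
```

Both forms return a new 3x3 array, so assigning into the result does not modify A.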
Upvotes: 3
Reputation: 987
Since none of these solutions worked for me, I thought I would write down the one that did; maybe it will be useful for somebody else. I use Python 3.x and worked with two 3D arrays. One, which I call data_3D, contains float values of recordings in a brain scan, and the other, template_3D, contains integers which represent regions of the brain. I wanted to select the values from data_3D that correspond to an integer region_code as per template_3D:
my_mask = np.in1d(template_3D, region_code).reshape(template_3D.shape)
data_3D_masked = data_3D[my_mask]
which gives me a 1D array of only relevant recordings.
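For reference, here is a small self-contained sketch of the same idea using np.isin, which keeps the input's shape so no reshape is needed (np.in1d is deprecated in recent NumPy releases). The tiny arrays are invented stand-ins for the real scan data:

```python
import numpy as np

# Hypothetical 2x2x2 stand-ins for the real template and recordings.
template_3D = np.array([[[1, 2], [2, 3]],
                        [[3, 1], [2, 2]]])
data_3D = np.arange(8, dtype=float).reshape(2, 2, 2)
region_code = 2

# Boolean mask with the same shape as template_3D.
my_mask = np.isin(template_3D, region_code)

# Boolean indexing with a full-shape mask returns a 1-D array of matches.
data_3D_masked = data_3D[my_mask]
print(data_3D_masked)
```

For a single code, template_3D == region_code gives the same mask; np.isin is mainly useful when region_code is a list of several codes.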
Upvotes: 3
Reputation: 231335
Your x is 3x2:
In [379]: x
Out[379]:
array([[1, 2],
[2, 3],
[3, 4]])
Make a 3 element boolean mask:
In [380]: rowmask=np.array([False,False,True])
That can be used to select the rows where it is True, or where it is False. In both cases the result is 2d:
In [381]: x[rowmask,:]
Out[381]: array([[3, 4]])
In [382]: x[~rowmask,:]
Out[382]:
array([[1, 2],
[2, 3]])
This is without using the MaskedArray subclass. To make such an array, we need a mask that matches x in shape. There isn't provision for masking just one dimension.
In [393]: xmask=np.stack((rowmask,rowmask),-1) # column stack
In [394]: xmask
Out[394]:
array([[False, False],
[False, False],
[ True, True]], dtype=bool)
In [395]: np.ma.MaskedArray(x,xmask)
Out[395]:
masked_array(data =
[[1 2]
[2 3]
[-- --]],
mask =
[[False False]
[False False]
[ True True]],
fill_value = 999999)
Applying compressed to that produces a raveled array: array([1, 2, 2, 3])
Since masking is element by element, it could mask one element in row 1, two in row 2, etc. So in general compressing, i.e. removing the masked elements, will not yield a 2d array. The flattened form is the only general choice.
np.ma makes most sense when there's a scattering of masked values. It isn't of much value if you want to select, or deselect, whole rows or columns.
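To make the flattening argument concrete, here is a short sketch with a scattered mask (chosen arbitrarily for illustration) next to plain boolean row selection:

```python
import numpy as np

x = np.array([[1, 2], [2, 3], [3, 4]])

# Scattered mask: one element masked in row 0, none in row 1, both in row 2.
scattered = np.array([[True, False],
                      [False, False],
                      [True, True]])

# Three values survive, which cannot form a rectangular 2-D array,
# so compressed() returns them flattened.
flat = np.ma.MaskedArray(x, scattered).compressed()

# Whole-row selection needs no masked array at all and stays 2-D.
rowmask = np.array([False, False, True])
kept = x[~rowmask]
print(flat, kept, sep="\n")
```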
===============
Here are more typical masked arrays:
In [403]: np.ma.masked_inside(x,2,3)
Out[403]:
masked_array(data =
[[1 --]
[-- --]
[-- 4]],
mask =
[[False True]
[ True True]
[ True False]],
fill_value = 999999)
In [404]: np.ma.masked_equal(x,2)
Out[404]:
masked_array(data =
[[1 --]
[-- 3]
[3 4]],
mask =
[[False True]
[ True False]
[False False]],
fill_value = 2)
In [406]: np.ma.masked_outside(x,2,3)
Out[406]:
masked_array(data =
[[-- 2]
[2 3]
[3 --]],
mask =
[[ True False]
[False False]
[False True]],
fill_value = 999999)
Upvotes: 9
Reputation: 114781
In your last example, the problem is not the mask. It is your use of compressed. From the docstring of compressed:
Return all the non-masked data as a 1-D array.
So compressed flattens the nonmasked values into a 1-d array. (It has to, because there is no guarantee that the compressed data will have an n-dimensional structure.)
Take a look at the masked array before you compress it:
In [8]: np.ma.masked_array(x, mask2)
Out[8]:
masked_array(data =
[[1 2]
[2 3]
[-- --]],
mask =
[[False False]
[False False]
[ True True]],
fill_value = 999999)
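If the goal is to stay with a masked array and still get a 2-D result, one option is np.ma.compress_rows, which drops every row that contains at least one masked value. This is a sketch of an alternative, not something the question's code used:

```python
import numpy as np

x = np.array([[1, 2], [2, 3], [3, 4]])
mask = [False, False, True]
mask2 = np.column_stack((mask, mask))

masked = np.ma.masked_array(x, mask2)

# Drop whole rows containing masked values; the result is a plain 2-D ndarray.
newX = np.ma.compress_rows(masked)
print(newX)
```

Because a row is removed as soon as any of its elements is masked, this only matches the intent when the mask really marks whole rows.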
Upvotes: 1
Reputation: 214927
Is this what you are looking for?
import numpy as np
x[~np.array(mask)]
# array([[1, 2],
# [2, 3]])
Or, using a NumPy masked array:
newX = np.ma.array(x, mask = np.column_stack((mask, mask)))
newX
# masked_array(data =
# [[1 2]
# [2 3]
# [-- --]],
# mask =
# [[False False]
# [False False]
# [ True True]],
# fill_value = 999999)
Upvotes: 32