Reputation: 1189
I have two 2D arrays, one of numbers and one of boolean values:
x =
array([[ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
[ 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
[ 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
[ 4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
[ 5., 5., 5., 5., 5., 5., 5., 5., 5., 5.],
[ 6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
[ 7., 7., 7., 7., 7., 7., 7., 7., 7., 7.],
[ 8., 8., 8., 8., 8., 8., 8., 8., 8., 8.],
[ 9., 9., 9., 9., 9., 9., 9., 9., 9., 9.]])
idx =
array([[False, False, False, False, False, False, False, False, False, False],
[False, True, True, True, True, True, False, False, False, False],
[False, True, True, True, True, True, False, False, False, False],
[False, True, True, True, True, True, False, False, False, False],
[False, False, False, True, True, True, True, False, False, False],
[False, False, False, False, True, True, True, False, False, False],
[False, False, False, False, False, False, True, False, False, False],
[False, False, False, False, False, False, False, True, False, False],
[False, False, False, False, False, False, False, False, False, False],
[False, False, False, False, False, False, False, False, False, False]], dtype=bool)
When I index the array it returns a 1D array:
x[idx]
array([ 1., 1., 1., 1., 1., 2., 2., 2., 2., 2., 3., 3., 3.,
3., 3., 4., 4., 4., 4., 5., 5., 5., 6., 7.])
How do I index the array so that it returns a 2D array, giving this expected output:
x[idx]
array([[ 1., 1., 1., 1., 1.],
[ 2., 2., 2., 2., 2.],
[ 3., 3., 3., 3., 3.],
[ 4., 4., 4., 4.],
[ 5., 5., 5.],
[ 6.],
[ 7.]])
Upvotes: 6
Views: 3538
Reputation: 1189
EDIT: This creates an array of lists (recent NumPy versions need dtype=object for rows of unequal length):
np.array([val[idx[i]].tolist() for i, val in enumerate(x) if idx[i].any()], dtype=object)
array([[1.0, 1.0, 1.0, 1.0, 1.0],
[2.0, 2.0, 2.0, 2.0, 2.0],
[3.0, 3.0, 3.0, 3.0, 3.0],
[4.0, 4.0, 4.0, 4.0],
[5.0, 5.0, 5.0],
[6.0],
[7.0]], dtype=object)
Upvotes: 0
Reputation: 11734
Your command returns a 1D array because the 2D result you want cannot be produced, for two reasons: (a) it would destroy the column structure, which is usually needed. For example, the 7 in your requested output originally belonged to column 7, and would now sit in column 0; and (b) numpy does not, afaik, support multidimensional arrays with different sizes along the same dimension. What I mean is that numpy can't have an array whose first three rows are of length 5, whose 4th row is of length 4, and so on. All the rows (same dimension) need to have the same length.
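To make point (b) concrete, here is a minimal sketch that rebuilds `x` and `idx` from the question (the slice-based construction of `idx` is my own shorthand, not the asker's code) and counts how many values each row contributes. The counts differ from row to row, which is why the selection cannot be a rectangular 2D array.

```python
import numpy as np

# Rebuild the arrays from the question (construction is an assumption;
# the values match the printed arrays).
x = np.repeat(np.arange(10.0)[:, None], 10, axis=1)
idx = np.zeros((10, 10), dtype=bool)
idx[1:4, 1:6] = True
idx[4, 3:7] = True
idx[5, 4:7] = True
idx[6, 6] = True
idx[7, 7] = True

# Boolean-mask indexing always flattens the selection to 1D.
flat = x[idx]
print(flat.shape)   # (24,)

# Number of selected values per row: the rows are not all the same length.
counts = idx.sum(axis=1)
print(counts)       # [0 5 5 5 4 3 1 1 0 0]
```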
I think the best result you could hope for is an array of arrays (and not a 2D array). This is how I would construct it, though there are probably better ways I don't know of:
In [9]: import numpy as np
In [11]: np.array([r[ridx] for r, ridx in zip(x, idx) if ridx.sum() > 0], dtype=object)
Out[11]:
array([array([ 1., 1., 1., 1., 1.]), array([ 2., 2., 2., 2., 2.]),
array([ 3., 3., 3., 3., 3.]), array([ 4., 4., 4., 4.]),
array([ 5., 5., 5.]), array([ 6.]), array([ 7.])], dtype=object)
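For reference, a Python 3 sketch of the same idea, keeping the result as a plain list of arrays rather than an object array. It also shows one possible alternative that is not part of the approach above: if the column positions matter, `np.where` can keep the full 2D shape and fill the unselected cells with NaN. The names `rows` and `masked` are only illustrative.

```python
import numpy as np

# Rebuild the arrays from the question (construction is an assumption;
# the values match the printed arrays).
x = np.repeat(np.arange(10.0)[:, None], 10, axis=1)
idx = np.zeros((10, 10), dtype=bool)
idx[1:4, 1:6] = True
idx[4, 3:7] = True
idx[5, 4:7] = True
idx[6, 6] = True
idx[7, 7] = True

# Ragged result: one 1D array per row that has at least one True.
rows = [r[m] for r, m in zip(x, idx) if m.any()]
for r in rows:
    print(r)

# Alternative that preserves the column structure: same shape as x,
# with NaN wherever idx is False.
masked = np.where(idx, x, np.nan)
print(masked.shape)  # (10, 10)
```

The list form loses the original column positions, as described above; the NaN-filled form keeps them at the cost of carrying the unselected cells along.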
Upvotes: 4