Reputation: 163
https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html
If obj.ndim == x.ndim, x[obj] returns a 1-dimensional array filled with the elements of x corresponding to the True values of obj. The search order will be row-major, C-style. If obj has True values at entries that are outside of the bounds of x, then an index error will be raised. If obj is smaller than x it is identical to filling it with False.
I read in the NumPy reference that I can index a larger array with a smaller boolean array, and that the remaining entries are automatically filled with False.
Example: From an array, select all rows which sum up to less or equal two:
>>> x = np.array([[0, 1], [1, 1], [2, 2]])
>>> rowsum = x.sum(-1)
>>> x[rowsum <= 2, :]
array([[0, 1],[1, 1]])
But if rowsum would have two dimensions as well:
>>> rowsum = x.sum(-1, keepdims=True)
>>> rowsum.shape
(3, 1)
>>> x[rowsum <= 2, :] # fails
IndexError: too many indices
>>> x[rowsum <= 2]
array([0, 1])
The last one giving only the first elements because of the extra dimension.
But the example simply doesn't work. It fails with "IndexError: boolean index did not match indexed array along dimension 1; dimension is 2 but corresponding boolean dimension is 1".
How can I make it work? I'm using Python 3.6.3 and NumPy 1.13.3.
Upvotes: 3
Views: 1398
Reputation: 18628
Since NumPy 1.11, that documented example is not compatible with the new default behaviour (see boolean-indexing-changes in the release notes):
Boolean indexing changes.
...
...
Boolean indexes must match the dimension of the axis that they index.
...
The internals have been optimized, but the docs have not been updated yet.
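To make the rule quoted above concrete, here is a minimal sketch, assuming a NumPy version in which the stricter behaviour is enforced (the asker's 1.13.3 already raises this error). The names mask_2d, mask_1d, raised and selected are only illustrative.

```python
import numpy as np

x = np.array([[0, 1], [1, 1], [2, 2]])

# A (3, 1) mask applied to a (3, 2) array: axis 1 has length 2,
# but the mask has length 1 there, so the shapes do not match.
mask_2d = x.sum(-1, keepdims=True) <= 2

try:
    x[mask_2d]
    raised = False
except IndexError:
    raised = True

# A (3,) mask matches axis 0 of x exactly, so it selects whole rows.
mask_1d = x.sum(-1) <= 2
selected = x[mask_1d]
```

The smaller mask is no longer padded with False; each axis of the boolean index has to have the same length as the axis it indexes.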
Upvotes: 2
Reputation: 7649
I think what you are looking for is NumPy broadcasting.
import numpy as np
x = np.array([[0, 1], [1, 1], [2, 2]])
rowsum = x.sum(axis=1)
x[rowsum <= 2]
Gives:
array([[0, 1],
       [1, 1]])
The problem is that you used keepdims=True, which means the sum creates a column vector of shape (3, 1), rather than a rank-one array of shape (3,) which can be broadcast.
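If the keepdims=True result is needed elsewhere, the mask can still be reduced to one dimension before indexing. This is only a sketch of two possible ways to do that, not the only ones; the names rows_a and rows_b are illustrative.

```python
import numpy as np

x = np.array([[0, 1], [1, 1], [2, 2]])
rowsum = x.sum(axis=1, keepdims=True)   # shape (3, 1)

# Option 1: take the single column of the mask, giving shape (3,).
rows_a = x[(rowsum <= 2)[:, 0]]

# Option 2: drop the length-1 axis explicitly with np.squeeze.
rows_b = x[np.squeeze(rowsum <= 2, axis=1)]
```

Both variants give the same rows as indexing with the mask built from x.sum(axis=1).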
Upvotes: 2