Reputation: 855
I was trying to build a kind of 2D filter with NumPy, and I found something that looks like a bug to me.
In the example below, I'm trying to select the columns at index 2 and 4 of the first, second, and last rows of my data, i.e.:
[[ 2 4]
[ 8 10]
[26 28]]
I am aware that the second-to-last statement, data[mask][:,[2,4]], does return that, but I can't assign anything through it (it returns a copy). And that still doesn't explain why the last one fails.
import numpy as np
# create my data: 5x6 array
data = np.arange(0,30).reshape(5,6)
# mask: keep only rows 1, 2, and 5 (counting from 1)
mask = np.array([1,1,0,0,1])
mask = mask.astype(bool)
# this is fine
print 'data\n', data, '\n'
# this is good
print 'mask\n', mask, '\n'
# this is nice
print 'data[mask]\n', data[mask], '\n'
# this is great
print 'data[mask, 2]\n', data[mask, 2], '\n'
# this is awesome
print 'data[mask][:,[2,4]]\n', data[mask][:,[2,4]], '\n'
# this fails ??
print 'data[mask, [2,4]]\n', data[mask, [2,4]], '\n'
output:
data
[[ 0 1 2 3 4 5]
[ 6 7 8 9 10 11]
[12 13 14 15 16 17]
[18 19 20 21 22 23]
[24 25 26 27 28 29]]
mask
[ True True False False True]
data[mask]
[[ 0 1 2 3 4 5]
[ 6 7 8 9 10 11]
[24 25 26 27 28 29]]
data[mask, 2]
[ 2 8 26]
data[mask][:,[2,4]]
[[ 2 4]
[ 8 10]
[26 28]]
data[mask, [2,4]]
Traceback (most recent call last):
[...]
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (3,) (2,)
I'm posting this here because I'm not confident enough in my NumPy skills to be sure this is a bug and to file a bug report...
Thanks for your help/feedback!
Upvotes: 2
Views: 321
Reputation: 2699
This is not a bug.
It is the documented behavior.
If you read the Advanced Indexing section of the array indexing documentation, you will notice that it says:
Purely integer array indexing When the index consists of as many integer arrays as the array being indexed has dimensions, the indexing is straight forward, but different from slicing. Advanced indexes always are broadcast and iterated as one:
result[i_1, ..., i_M] == x[ind_1[i_1, ..., i_M], ind_2[i_1, ..., i_M],
..., ind_N[i_1, ..., i_M]]
Therefore, when the index arrays have the same length,
print 'data[mask, [1,2,4]]\n', data[mask, [1,2,4]], '\n'
works and outputs
data[mask, [1,2,4]]
[ 1 8 28]
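The index arrays must have the same length, or be broadcastable to a common shape. In your case the mask selects three rows, so it acts as an index array of shape (3,), which cannot be broadcast against the two column indices of shape (2,).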
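To make the pairing concrete, here is a small sketch in Python 3 syntax. The names `rows`, `cols`, `paired`, and `grid` are illustrative and not from the question. It assumes the boolean mask behaves like the integer indices of its True entries, which is how it takes part in broadcasting with the column index:

```python
import numpy as np

data = np.arange(30).reshape(5, 6)
mask = np.array([1, 1, 0, 0, 1], dtype=bool)

# A boolean mask acts like the integer indices of its True entries.
rows = mask.nonzero()[0]          # indices 0, 1, 4 -> shape (3,)
cols = np.array([1, 2, 4])        # shape (3,), same length as rows

# Index arrays of equal shape are paired up: (0, 1), (1, 2), (4, 4).
paired = data[rows, cols]
print(paired)                     # [ 1  8 28]

# Turning rows into a column vector of shape (3, 1) lets it broadcast
# against a (2,) column index into a (3, 2) grid of (row, column) pairs.
grid = data[rows[:, None], [2, 4]]
print(grid)
```

The second form is one way to get the rectangular selection the question asks for, using a single index expression.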
Maybe you can achieve what you want using the ix_ function (see array indexing):
columns = np.array([2, 4], dtype=np.intp)
print data[np.ix_(mask, columns)]
which outputs
[[ 2 4]
[ 8 10]
[26 28]]
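Since the question also mentions assignment, here is a hedged sketch (Python 3 syntax, with -1 as an arbitrary placeholder value) of how the chained form and the np.ix_ form differ when writing. The variable `unchanged` is only there for illustration:

```python
import numpy as np

data = np.arange(30).reshape(5, 6)
mask = np.array([1, 1, 0, 0, 1], dtype=bool)
columns = np.array([2, 4], dtype=np.intp)

# Chained indexing: data[mask] is a temporary copy, so the assignment
# modifies that copy and leaves data untouched.
data[mask][:, columns] = -1
unchanged = data[0, 2]            # still the original value

# A single index expression built with np.ix_ writes into data itself.
data[np.ix_(mask, columns)] = -1
print(data)
```

Only the six selected cells change; rows excluded by the mask and the other columns keep their original values.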
Upvotes: 2