bastien girschig

Reputation: 855

Bug (?) with numpy indexing

I was trying to achieve a kind of 2d filter with numpy, and I found something that looks to me like a bug.

In the example below, I'm trying to target the 3rd and 5th columns (indices 2 and 4) of the first, second and last rows of my data, i.e.:

[[ 2  4]
 [ 8 10]
 [26 28]] 

I am aware that the second-to-last statement does return that, but I can't assign anything through it (it returns a copy). And that still doesn't explain why the last one fails.

import numpy as np

# create my data: 5x6 array
data = np.arange(0,30).reshape(5,6)

# mask: only keep rows 1, 2 and 5
mask = np.array([1,1,0,0,1])
mask = mask.astype(bool)

# this is fine
print 'data\n', data, '\n'

# this is good
print 'mask\n', mask, '\n'

# this is nice
print 'data[mask]\n', data[mask], '\n'

# this is great
print 'data[mask, 2]\n', data[mask, 2], '\n'

# this is awesome
print 'data[mask][:,[2,4]]\n', data[mask][:,[2,4]], '\n'

# this fails ??
print 'data[mask, [2,4]]\n', data[mask, [2,4]], '\n'

output:

data
[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]
 [24 25 26 27 28 29]] 

mask
[ True  True False False  True] 

data[mask]
[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [24 25 26 27 28 29]] 

data[mask, 2]
[ 2  8 26] 

data[mask][:,[2,4]]
[[ 2  4]
 [ 8 10]
 [26 28]] 

data[mask, [2,4]]
Traceback (most recent call last):
[...]
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (3,) (2,)

I'm posting this here, because I'm not confident enough in my numpy skills to be sure this is a bug, and file a bug report...

Thanks for your help/feedback !

Upvotes: 2

Views: 321

Answers (1)

PilouPili

Reputation: 2699

This is not a bug.

It is the documented behavior of advanced indexing.

The Advanced Indexing section of the NumPy array indexing documentation says:

Purely integer array indexing: When the index consists of as many integer arrays as the array being indexed has dimensions, the indexing is straight forward, but different from slicing. Advanced indexes always are broadcast and iterated as one:

result[i_1, ..., i_M] == x[ind_1[i_1, ..., i_M], ind_2[i_1, ..., i_M],
                       ..., ind_N[i_1, ..., i_M]]

Therefore, with three column indices (one for each row the mask selects),

print 'data[mask, [1,2,4]]\n', data[mask, [1,2,4]], '\n'

works and outputs

data[mask, [1,2,4]]
[ 1  8 28]

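The index arrays must have the same shape, or shapes that can be broadcast together. Your mask selects three rows, so it behaves like an index array of shape (3,), which cannot be paired with the two column indices [2,4] of shape (2,).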
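A minimal sketch of that pairing rule, reusing the question's data. The names rows, paired and block are only illustrative, and np.flatnonzero is used here just to make the mask's integer positions visible:

```python
import numpy as np

data = np.arange(30).reshape(5, 6)
mask = np.array([1, 1, 0, 0, 1], dtype=bool)

# Inside an index, a boolean mask acts like the integer positions of its True entries.
rows = np.flatnonzero(mask)            # array([0, 1, 4])

# Index arrays of equal length are paired element by element:
# (0, 1), (1, 2), (4, 4)
paired = data[rows, [1, 2, 4]]
print(paired)                          # [ 1  8 28]

# Shapes (3, 1) and (2,) broadcast to (3, 2), so every selected row
# is combined with every selected column.
block = data[rows[:, None], [2, 4]]
print(block)
```

Turning the row indices into a column vector is what makes the two index arrays broadcast to a 2-D grid instead of being zipped together.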

You can get what you want with the np.ix_ function, which combines the row and column selections into an open mesh (see the array indexing documentation):

columns = np.array([2, 4], dtype=np.intp)
print data[np.ix_(mask, columns)]

which outputs

[[ 2  4]
 [ 8 10]
 [26 28]]
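Since the question also mentions assignment, here is a hedged sketch of how the same np.ix_ index could be used on the left-hand side. The value -1 and the variable names are only for illustration:

```python
import numpy as np

data = np.arange(30).reshape(5, 6)
mask = np.array([1, 1, 0, 0, 1], dtype=bool)
columns = np.array([2, 4])

# Chained indexing writes into the temporary copy returned by data[mask],
# so data itself is left unchanged.
data[mask][:, columns] = -1
still_original = data[0, 2]            # 2

# A single indexing expression on the left-hand side writes into data directly.
data[np.ix_(mask, columns)] = -1
print(data)
```

After the second assignment, only the six targeted cells (rows 0, 1 and 4, columns 2 and 4) are modified.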

Upvotes: 2
