Maikefer

Reputation: 580

Pythonic way of retrieving elements of 2D array by indices

Say I have this array of data

data = np.asarray([[1, 2, 3, 4], ['a', 'b', 'c', 'd'], ['A', 'B', 'C', 'D'], ['x', 'y', 'z', 'zz']])

and these indices, each "pair" corresponding to one cell in the data matrix

indices = np.asarray([[0, 0], [3, 0], [2, 2], [3, 2]])

Now I want to retrieve the data from the specified cells. I can do this via:

searched_data = []
for x, y in indices:
    searched_data.append(data[y][x])

Is there a more pythonic or more numpy-ish variation where I can do this in one line by fancy array indexing or something?

I tried (inspired by this post):

x_indexed1 = data[indices[:, 1]][:,[indices[:, 0]]]

but this gives me

[[['1' '4' '3' '4']]

 [['1' '4' '3' '4']]

 [['A' 'D' 'C' 'D']]

 [['A' 'D' 'C' 'D']]]

and this

x_indexed = data[np.ix_(indices[:, 1],indices[:, 0])]

which gives

[['1' '4' '3' '4']
 ['1' '4' '3' '4']
 ['A' 'D' 'C' 'D']
 ['A' 'D' 'C' 'D']]

Upvotes: 1

Views: 66

Answers (1)

juanpa.arrivillaga

Reputation: 95948

You were close, but when you want to index into a numpy.ndarray with aligned indices like that, don't use [][]. Use a tuple to do multidimensional indexing:

>>> data[indices[:, 1], indices[:,0]]
array(['1', '4', 'C', 'D'], dtype='<U21')

To make it clearer:

>>> ys = indices[:, 1]
>>> xs = indices[:, 0]
>>> data[ys, xs]
array(['1', '4', 'C', 'D'], dtype='<U21')
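
As a quick sanity check, here is a minimal sketch comparing the loop from the question with the one-line version. It assumes the loop was meant to iterate over indices, and the names looped, fancy and unpacked are only illustrative:

```python
import numpy as np

data = np.asarray([[1, 2, 3, 4],
                   ['a', 'b', 'c', 'd'],
                   ['A', 'B', 'C', 'D'],
                   ['x', 'y', 'z', 'zz']])
indices = np.asarray([[0, 0], [3, 0], [2, 2], [3, 2]])

# Loop from the question: each pair is (x, y) = (column, row).
looped = [data[y][x] for x, y in indices]

# One fancy-indexing expression: row indices first, column indices second.
fancy = data[indices[:, 1], indices[:, 0]]

# Same selection without slicing twice: transpose, then unpack as (xs, ys).
xs, ys = indices.T
unpacked = data[ys, xs]

print(fancy)
```

All three give the same four elements. The important part is that both index arrays sit inside a single pair of brackets, so NumPy pairs them up element by element.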

Your first attempt was something like this:

>>> data[ys][:,[xs]]
array([[['1', '4', '3', '4']],

       [['1', '4', '3', '4']],

       [['A', 'D', 'C', 'D']],

       [['A', 'D', 'C', 'D']]], dtype='<U21')

Breaking that down: "partial indexing" assumes a full slice : for every dimension you leave out, so data[ys] is equivalent to data[ys, :], which selects these rows:

>>> data[ys]
array([['1', '2', '3', '4'],
       ['1', '2', '3', '4'],
       ['A', 'B', 'C', 'D'],
       ['A', 'B', 'C', 'D']], dtype='<U21')

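You then indexed that result with [:, [xs]], which keeps all rows and picks the columns listed in xs. Because xs is wrapped in a list, the index array has shape (1, 4) instead of (4,), which adds a length-1 dimension to the result (an "unsqueeze"). Writing ... instead of : gives the same result here: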

>>> data[ys][...,[xs]]
array([[['1', '4', '3', '4']],

       [['1', '4', '3', '4']],

       [['A', 'D', 'C', 'D']],

       [['A', 'D', 'C', 'D']]], dtype='<U21')
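
To see how the attempts relate to each other, the sketch below compares their shapes. The variable names (rows, cross, wrapped, outer, paired) are my own, and this is only an illustration of the behaviour described above:

```python
import numpy as np

data = np.asarray([[1, 2, 3, 4],
                   ['a', 'b', 'c', 'd'],
                   ['A', 'B', 'C', 'D'],
                   ['x', 'y', 'z', 'zz']])
indices = np.asarray([[0, 0], [3, 0], [2, 2], [3, 2]])
ys, xs = indices[:, 1], indices[:, 0]

rows = data[ys]          # shape (4, 4): one full row per entry of ys
cross = rows[:, xs]      # shape (4, 4): every selected row x every selected column
wrapped = rows[:, [xs]]  # shape (4, 1, 4): the extra list adds a length-1 axis

# np.ix_ builds the same row-by-column cross product in one step.
outer = data[np.ix_(ys, xs)]

# The pairwise result is the diagonal of that cross product.
paired = data[ys, xs]

print(rows.shape, cross.shape, wrapped.shape, outer.shape, paired.shape)
```

So both attempts compute every combination of the selected rows and columns, while data[ys, xs] only takes the matching pairs, which is the diagonal of that 4x4 result.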

Upvotes: 2
