Reputation: 1070
Suppose we have an array of indices of another numpy array:
import numpy as np
a = np.array([0, 3, 1])
b = np.array([0, 10, 20, 30, 40, 50, 60, 70])
We can use the array a
as index directly:
b[a] # np.array([0, 30, 10])
But what if array a has more than one dimension? For example,
a = np.array([[0, 2], [1, 3], [2, 4]])
# I want to get b[a] = np.array([[0, 20], [10, 30], [20, 40]])
Numpy indexing doesn't work if the number of dimensions of a is greater than 1. We can achieve the desired result by using map
map(lambda x: b[x], a)
However, it is quite slow. For 1-dimensional case, direct indexing is about 10-100 times faster than using map
.
Is there a way to do it faster?
Upvotes: 3
Views: 2259
Reputation: 231615
What's the problem? I can index b
with a 2d array. The output just matches a1
in shape:
In [64]: b = np.array([0, 10, 20, 30, 40, 50, 60, 70])
In [65]: a1 = np.array([[0, 2], [1, 3], [2, 4]])
In [66]: b[a1]
Out[66]:
array([[ 0, 20],
[10, 30],
[20, 40]])
b[a1]
is not the same as b[a1[:,0],a1[:,1]]
. That is the 2 columns of a1
do not provide two indices (which would require a 2d b
).
Upvotes: 2
Reputation: 221674
There is built-in np.take
for exactly that same task -
np.take(b,a)
You can also flatten a
with .ravel()
, index into b
and reshape back to a's
shape -
b[a.ravel()].reshape(a.shape)
These NumPy based approaches would be much better than map(lambda x: b[x], a)
both in terms of both performance and memory as using map
would give us list of arrays.
Sample run -
In [34]: a
Out[34]:
array([[0, 2],
[1, 3],
[2, 4]])
In [35]: b
Out[35]: array([ 0, 10, 20, 30, 40, 50, 60, 70])
In [36]: np.take(b,a)
Out[36]:
array([[ 0, 20],
[10, 30],
[20, 40]])
In [37]: b[a.ravel()].reshape(a.shape)
Out[37]:
array([[ 0, 20],
[10, 30],
[20, 40]])
Runtime tests -
In [39]: a = np.random.randint(0,100,(200,100))
In [40]: b = np.random.randint(0,100,(20000))
In [41]: %timeit map(lambda x: b[x], a)
1000 loops, best of 3: 643 µs per loop
In [42]: %timeit np.take(b,a)
10000 loops, best of 3: 105 µs per loop
In [43]: %timeit b[a.ravel()].reshape(a.shape)
1000 loops, best of 3: 231 µs per loop
Upvotes: 1