Reputation: 362
Given a 2D NumPy array a and a list of indices stored in index, there must be a way of extracting the values at those indices very efficiently. Using a for loop as follows takes about 5 ms, which seems extremely slow for extracting 2000 elements:
import numpy as np
import time
# generate dummy array
a = np.arange(4000).reshape(1000, 4)
# generate dummy list of indices
r1 = np.random.randint(1000, size=2000)
r2 = np.random.randint(3, size=2000)
index = np.concatenate([[r1], [r2]]).T
start = time.time()
result = [a[i, j] for [i, j] in index]
print(time.time() - start)
How can I increase the extraction speed? np.take does not seem appropriate here because it would return a 2D array instead of a 1D array.
Upvotes: 2
Views: 457
Reputation: 29690
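Another option would be numpy.ravel_multi_index, which converts the (row, column) pairs into flat indices into a, so you avoid computing them by hand. Those flat indices can then be passed to np.take or used to index a.ravel():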
np.ravel_multi_index(index.T, a.shape)
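Since ravel_multi_index only returns the flat positions, one more step is needed to get the values. Below is a minimal sketch of that step, assuming the same shapes as in the question; the seeded generator rng and the names flat and values are illustrative and not part of the original posts.

```python
import numpy as np

# Dummy data shaped like the question's (seeded generator is an assumption
# made here so the example is reproducible).
rng = np.random.default_rng(0)
a = np.arange(4000).reshape(1000, 4)
index = np.column_stack([
    rng.integers(1000, size=2000),  # row indices
    rng.integers(4, size=2000),     # column indices
])

# Convert each (row, col) pair to a position in the flattened array.
flat = np.ravel_multi_index(index.T, a.shape)

# np.take with the default axis=None indexes the flattened array,
# so the result is 1D with one value per index pair.
values = np.take(a, flat)
```

Indexing with a.ravel()[flat] should give the same 1D result.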
Upvotes: 2
Reputation: 214957
You can use advanced indexing, which basically means extracting the row and column indices from the index array and then using them to extract values from a, i.e. a[index[:,0], index[:,1]]:
%timeit a[index[:,0], index[:,1]]
# 12.1 µs ± 368 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit [a[i, j] for [i, j] in index]
# 2.22 ms ± 105 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
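To check that the two timed expressions return the same values, here is a small self-contained sketch. The seeded generator and the names fast, slow and also_fast are assumptions introduced for illustration; the data only mimics the shapes used in the question.

```python
import numpy as np

# Dummy data shaped like the question's (seed chosen arbitrarily).
rng = np.random.default_rng(0)
a = np.arange(4000).reshape(1000, 4)
index = np.column_stack([
    rng.integers(1000, size=2000),  # row indices
    rng.integers(4, size=2000),     # column indices
])

# Advanced indexing: one array of rows and one array of columns,
# paired element-wise, giving a 1D result.
fast = a[index[:, 0], index[:, 1]]

# The original loop, converted to an array for comparison.
slow = np.array([a[i, j] for i, j in index])

# Equivalent spelling: a tuple of index arrays, one per axis.
also_fast = a[tuple(index.T)]
```

All three results are 1D arrays of length 2000, so the vectorised versions can replace the loop directly.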
Upvotes: 2