anderstood

Reputation: 362

Efficiently extract values from an array using a list of indices

Given a 2D NumPy array a and a list of (row, column) index pairs stored in index, there must be a way of extracting the corresponding values very efficiently. Using a for loop as follows takes about 5 ms, which seems extremely slow for extracting 2000 elements:

import numpy as np
import time

# generate dummy array 
a = np.arange(4000).reshape(1000, 4) 
# generate dummy list of indices
r1 = np.random.randint(1000, size=2000)
r2 = np.random.randint(3, size=2000)
index = np.concatenate([[r1], [r2]]).T

start = time.time()
result = [a[i, j] for [i, j] in index]
print(time.time() - start)

How can I increase the extraction speed? np.take does not seem appropriate here because it would return a 2D array instead of a 1D array.

Upvotes: 2

Views: 457

Answers (2)

miradulo

Reputation: 29690

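Another option is numpy.ravel_multi_index, which converts the (row, column) pairs into flat indices into the raveled array, so you avoid computing them by hand. The line below returns those flat indices; use them to index a.ravel() (or pass them to np.take) to get the values.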

np.ravel_multi_index(index.T, a.shape)
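As a minimal sketch of how the flat indices might be used, the snippet below rebuilds the question's dummy data and checks the result against the original loop. The seeded generator and the names flat, values and expected are assumptions made for illustration; they are not part of the answer.

```python
import numpy as np

# Dummy data shaped like the question's (seeded here only for reproducibility).
a = np.arange(4000).reshape(1000, 4)
rng = np.random.default_rng(0)
index = np.column_stack([rng.integers(1000, size=2000),
                         rng.integers(3, size=2000)])

# Convert each (row, column) pair into a position in the flattened array.
flat = np.ravel_multi_index(index.T, a.shape)

# Index the flattened array to get a 1D array of values.
values = a.ravel()[flat]

# Reference result from the question's list comprehension.
expected = np.array([a[i, j] for i, j in index])
print(np.array_equal(values, expected))
```

np.take(a, flat) should give the same 1D result, since np.take works on the flattened array when no axis is given.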

Upvotes: 2

akuiper

Reputation: 214957

You can use advanced (integer array) indexing: take the row indices and the column indices from index and pass both to a, i.e. a[index[:,0], index[:,1]]. This returns a 1D array of the selected values and is much faster than the loop:

%timeit a[index[:,0], index[:,1]]
# 12.1 µs ± 368 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit [a[i, j] for [i, j] in index]
# 2.22 ms ± 105 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
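To make the indexing step concrete, here is a small sketch on a 3x4 array, small enough to check by hand. The array contents and the names rows, cols and values are chosen for illustration only.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
# Each row of index is one (row, column) pair to extract.
index = np.array([[0, 1],
                  [2, 3],
                  [1, 0]])

rows, cols = index[:, 0], index[:, 1]

# Passing two integer arrays pairs them element-wise:
# the result is [a[0, 1], a[2, 3], a[1, 0]].
values = a[rows, cols]
print(values.tolist())  # [1, 11, 4]

# Equivalent spelling that avoids slicing the columns by hand.
same = a[tuple(index.T)]
```

The result is 1D with one value per index pair, which is the shape the question asks for.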

Upvotes: 2
