Michael
Michael

Reputation: 7839

Fast lookup from floating point index into array

Suppose I have n intervals of length len on an axis. Each of them is assigned a value in an ndarray. Now I'd like to look-up some values at positions query_pos given as floating point numbers. Currently, I plan to do it the following way:

n = 100
len = 0.39483
data = ... # creates ndarray with data of length n = 100
query_pos = ... # creates an ndarray with positions to query
values_at_query_pos = data[(query_pos / len).astype(int)] # questionable line

Is that a good way to do it or are there more efficient ways to convert a floating point query position into an integer index and then read from an array? I particularly wonder, wether astype(int) is a cheap or an expensive operation, compared for example to the division or the memory read.

Some more remarks:

Upvotes: 0

Views: 366

Answers (2)

user6758673
user6758673

Reputation: 589

You can output the result of the division straight to an int array:

idx = np.empty_like(query_pos, int)
np.divide(query_pos, len, out=idx, casting='unsafe')

This will be noticeably faster only for large arrays. But this code is harder to read, so only optimize if it's a bottleneck!

Upvotes: 1

Divakar
Divakar

Reputation: 221674

Instead of dividing each element of query_pos by that scalar, we can pre-calculate the reciprocal of the scalar and use multiplication instead for some speedup there. The intuition is that division is a more costly affair than multiplication.

Here's a quick runtime test on it -

In [177]: # Setup inputs
     ...: n = 100
     ...: len1 = 0.39483
     ...: data = np.random.rand(100)
     ...: query_pos = np.random.randint(0,25,(100000))
     ...: 

In [178]: %timeit query_pos / len1
1000 loops, best of 3: 612 µs per loop

In [179]: %timeit query_pos * (1/len1)
10000 loops, best of 3: 173 µs per loop

Secondly, if there are many repeated indices, just like in the setup used for the runtime test shown earlier, we can use np.take for some further marginal improvement, as shown below -

In [196]: %timeit data[(query_pos *(1/len1)).astype(int)]
1000 loops, best of 3: 538 µs per loop

In [197]: %timeit np.take(data,(query_pos * (1/len1)).astype(int))
1000 loops, best of 3: 526 µs per loop

If you are planning to use it on generic ndarrays, we would need to use axis param with np.take.

Comparing it with the original approach -

In [202]: %timeit data[(query_pos / len1).astype(int)]
1000 loops, best of 3: 967 µs per loop

Finally, on the question of how the division operation stacks up against converting to int, they seem comparable on the big dataset. But, it seems you can't avoid the conversion as needed for indexing. Here's a timing test on it -

In [210]: idx = query_pos * (1/len1)

In [211]: %timeit query_pos * (1/len1)
10000 loops, best of 3: 165 µs per loop

In [212]: %timeit idx.astype(int)
10000 loops, best of 3: 110 µs per loop

Upvotes: 4

Related Questions