Reputation: 7839
Suppose I have n intervals of length len on an axis. Each of them is assigned a value in an ndarray. Now I'd like to look up some values at positions query_pos, given as floating point numbers. Currently, I plan to do it the following way:
n = 100
len = 0.39483  # note: shadows the built-in len()
data = ...       # ndarray with data of length n = 100
query_pos = ...  # ndarray with positions to query
values_at_query_pos = data[(query_pos / len).astype(int)]  # questionable line
Is that a good way to do it, or are there more efficient ways to convert a floating point query position into an integer index and then read from an array? I particularly wonder whether astype(int) is a cheap or an expensive operation, compared for example to the division or the memory read.
Some more remarks:
Finally, it will be used in 2 and 3 dimensions. Currently, I plan to catch positions that would lead to illegal indices before they go into the lookup stage.
The data array will have a high enough resolution that I don't need any filtering or interpolation; that will be done in previous stages.
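The bounds check mentioned above could be sketched like this, using a boolean mask before the lookup (the sample data and query positions here are made up for illustration):

```python
import numpy as np

n = 100
interval_len = 0.39483  # renamed to avoid shadowing the built-in len()
data = np.random.rand(n)
query_pos = np.array([0.0, 1.5, 38.0, 40.0])  # 40.0 falls outside the covered range

# Convert positions to indices, then keep only queries that hit a legal index.
idx = (query_pos / interval_len).astype(int)
valid = (idx >= 0) & (idx < n)
values = data[idx[valid]]
```

Whether to drop invalid queries, clip them, or raise is a design choice for the earlier stages; the mask here just drops them.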
Upvotes: 0
Views: 366
Reputation: 589
You can output the result of the division straight to an int array:
idx = np.empty_like(query_pos, int)
np.divide(query_pos, len, out=idx, casting='unsafe')
This will be noticeably faster only for large arrays. But this code is harder to read, so only optimize if it's a bottleneck!
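Put together, a minimal end-to-end sketch of this trick (len1 stands in for the question's len; the sample data is made up):

```python
import numpy as np

len1 = 0.39483
data = np.random.rand(100)
query_pos = np.random.uniform(0.0, 39.0, 10000)

# Allocate the integer index array once, then write the division result
# into it; the unsafe cast truncates toward zero, like astype(int).
idx = np.empty_like(query_pos, dtype=int)
np.divide(query_pos, len1, out=idx, casting='unsafe')

values = data[idx]
```

For non-negative positions the truncating cast matches astype(int) exactly, so the result is the same as the original one-liner.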
Upvotes: 1
Reputation: 221674
Instead of dividing each element of query_pos by that scalar, we can pre-calculate the reciprocal of the scalar and multiply instead, for some speedup there. The intuition is that division is a costlier operation than multiplication.
Here's a quick runtime test on it -
In [177]: # Setup inputs
...: n = 100
...: len1 = 0.39483
...: data = np.random.rand(100)
...: query_pos = np.random.randint(0,25,(100000))
...:
In [178]: %timeit query_pos / len1
1000 loops, best of 3: 612 µs per loop
In [179]: %timeit query_pos * (1/len1)
10000 loops, best of 3: 173 µs per loop
Secondly, if there are many repeated indices, just like in the setup used for the runtime test shown earlier, we can use np.take for some further marginal improvement, as shown below -
In [196]: %timeit data[(query_pos *(1/len1)).astype(int)]
1000 loops, best of 3: 538 µs per loop
In [197]: %timeit np.take(data,(query_pos * (1/len1)).astype(int))
1000 loops, best of 3: 526 µs per loop
If you are planning to use it on generic ndarrays, we would need to use the axis param with np.take.
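For instance, in the 2-D case the axis param lets you pull out whole rows by the converted indices (a small made-up example):

```python
import numpy as np

# 4 rows of 3 values each; look up rows by integer index.
data2d = np.arange(12.0).reshape(4, 3)
idx = np.array([0, 2, 2, 3])

rows = np.take(data2d, idx, axis=0)  # equivalent to data2d[idx, :]
```

Without axis, np.take indexes into the flattened array, so axis=0 is what matches the row-lookup intent here.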
Comparing it with the original approach -
In [202]: %timeit data[(query_pos / len1).astype(int)]
1000 loops, best of 3: 967 µs per loop
Finally, on the question of how the division operation stacks up against converting to int, they seem comparable on the big dataset. But it seems you can't avoid the conversion, as it's needed for indexing. Here's a timing test on it -
In [210]: idx = query_pos * (1/len1)
In [211]: %timeit query_pos * (1/len1)
10000 loops, best of 3: 165 µs per loop
In [212]: %timeit idx.astype(int)
10000 loops, best of 3: 110 µs per loop
Upvotes: 4