Tania
Tania

Reputation: 428

How to efficiently find indexes of intervals in Python

I have a grid, and a list of values lying in the grid. How can I efficiently compute an index list for the values corresponding to the grid interval containing them. Here is a sample code:

xgrid = [304.0, 317.3, 330.7, 344.1, 357.4, 370.8]
xlist = [310, 320, 360]

output = []
for x in xlist:
    for xi in xgrid:
        if (xi < x):
            xindex = xi
    output.append(xindex)

print(output)

The expected output for this example is [304.0, 317.3, 357.4].

xgrid will be of size 50 or so, but xlist may be bigger and contain between 100-200 values.

Upvotes: 1

Views: 194

Answers (2)

popeye
popeye

Reputation: 886

You may try this, it would be time efficient.

xgrid = [304.0, 317.3, 330.7, 344.1, 357.4, 370.8]
xlist = [310, 320, 335]

print([xgrid[i] for i in range(len(xlist)) if xgrid[i] < xlist[i]])

Take the length of the list which is smaller and you can construct the if condition according to your problem.

Upvotes: 1

VirtualScooter
VirtualScooter

Reputation: 1888

The Python standard library offers bisect, which can be used to do the searching for you:

import bisect

xgrid = [304.0, 317.3, 330.7, 344.1, 357.4, 370.8]
xlist = [310, 320, 335]

def find_lt(a, x):
    'Find rightmost value less than x'
    i = bisect.bisect_left(a, x)
    if i:
        return a[i-1]
    raise ValueError

print([find_lt(xgrid, x) for x in xlist])
# Output: [304.0, 317.3, 330.7]

With respect to speed, I tried the following (append to code above):

import timeit

s = '''\
output = []
for x in xlist:
    for xi in xgrid:
        if (xi < x):
            xindex = xi
    output.append(xindex)
'''
s2 = '''\
output = [find_lt(xgrid, x) for x in xlist]
'''
print(timeit.timeit(s, number=100_000, globals=globals()))
print(timeit.timeit(s2, number=100_000, globals=globals()))

xgrid = [203.1, 207.2, 304.0, 317.3, 330.7,
         344.1, 357.4, 370.8, 400.1, 401.0]
xlist = [310, 320, 335, 399, 402]
print(timeit.timeit(s, number=100_000, globals=globals()))
print(timeit.timeit(s2, number=100_000, globals=globals()))

Output

0.11740579998877365
0.1047545000037644
0.28514970000833273
0.18074260000139475

This would indicate that the bisection algorithm is slightly faster for small lists, and likely becomes progressively better with larger lists.

Upvotes: 1

Related Questions