xirururu
xirururu

Reputation: 5508

How to find the closest smaller value in a large sorted array efficiently

For a sorted list, how can I find the smallest number which close to the a given number?

For example,

mysortedList = [37, 72, 235, 645, 715, 767, 847, 905, 908, 960]

How can I find the largest element which is less or equal to 700 quickly? (If I have 10 million elements, then it will be slow to search linearly). In this example, the answer is 645.

Upvotes: 5

Views: 2648

Answers (2)

OneCricketeer
OneCricketeer

Reputation: 191884

Manually implemented binary search

def find_largest_less_than(l, max_val, lo=0, hi=None):
    if hi is None:
        hi = len(l)

    if lo > hi: return None
    if hi - lo == 1: return None if max_val < l[lo] else l[lo]

    mid = lo + ((hi - lo) // 2)
    mid_val = l[mid]

    if max_val > mid_val:
        val = find_largest_less_than(l, max_val, mid, hi)
    elif max_val < mid_val:
        val = find_largest_less_than(l, max_val, 0, mid)

    return val

mysortedList = [37, 72, 235, 645, 715, 767, 847, 905, 908, 960]
my_max = 700

find_largest_less_than(mysortedList, my_max) # 645

Upvotes: 1

Simeon Visser
Simeon Visser

Reputation: 122496

You can use the bisect module:

import bisect

data = [37, 72, 235, 645, 715, 767, 847, 905, 908, 960]

location = bisect.bisect_left(data, 700)

result = data[location - 1]

This is a module in the standard library which will use binary search to find the desired result. Depending on the exact value that you need you can also use bisect_right instead of bisect_left.

This is faster than iterating over the list because the binary search algorithm can skip parts of the data that won't contain the answer. This makes it very suitable for finding the nearest number when the data is known to be sorted.

Upvotes: 10

Related Questions