MarsPlus
MarsPlus

Reputation: 321

A little confused about binary search

I implemented a binary search in Python:

def bisect(seq, goal, lo=0, hi=None):
    if hi == None:
        hi = len(seq)
    while True:
        middle  = (lo+hi)//2
        if seq[middle] == goal:
            return middle
        elif goal < seq[middle]: hi = middle
        elif goal > seq[middle]: lo = middle+1

        if lo >= hi:
            return -1

It should return the index of the item that is first met. However, when I apply it on a list like this:

seq = [-81, -81, 1, 2, 9, 10, 63, 79]
bisect(seq, -81)

it doesn't return 0 but 1. How can I fix this?

Upvotes: 2

Views: 158

Answers (2)

Don O&#39;Donnell
Don O&#39;Donnell

Reputation: 4728

For such a seemingly simple problem, nonetheless, getting the boundary conditions exactly right can be a challenge. So what I normally do in these cases is just use, or more usually, copy and adjust the code in the bisect module. The function you want is bisect_left, since you want the index of the leftmost value if there are more than one, or the index of the insertion point if there is no match.

Here is a copy of the bisect_left function from the Python 3.3 Std Lib:

def bisect_left(a, x, lo=0, hi=None):
    """Return the index where to insert item x in list a, assuming a is sorted.

    The return value i is such that all e in a[:i] have e < x, and all e in
    a[i:] have e >= x.  So if x already appears in the list, a.insert(x) will
    insert just before the leftmost x already there.

    Optional args lo (default 0) and hi (default len(a)) bound the
    slice of a to be searched.
    """

    if lo < 0:
        raise ValueError('lo must be non-negative')
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo+hi)//2
        if a[mid] < x: lo = mid+1
        else: hi = mid
    return lo

Upvotes: 1

hobbs
hobbs

Reputation: 239891

if seq[middle] == goal: return middle bails out without considering whether the same value might occur at a lower index. In your example, lo stays 0, hi becomes 7, then 3. When hi is 3, middle is 1, and that meets your condition so 1 is returned. Since any multiple occurrences of goal have to be consecutive to meet the condition that seq is nondecreasing (required for binary search), the easiest thing to do might be:

if seq[middle] == goal:
  while middle > lo and seq[middle - 1] == goal:
    middle = middle - 1
  return middle

Upvotes: 1

Related Questions