Issues with Binary Search Algorithm

Question

I have an assignment to write a binary search that returns the first iteration of the value we are looking for. I've been doing some research online and my search looks a lot like what i'm finding but i'm having an issue. If I pass this code an array that looks like this {10,5,5,3,2} it find the 5 at in the middle(The first thing it checks) and then just returns it. But that is not the first iteration of the 5 it is the second. What am I doing wrong? Is this even possible?

Thanks in advance!

The code(I'm using Java):

public static int binarySearch(int[] arr, int v){
    int lo = 0;
    int hi = arr.length-1;
    while(lo <= hi){
        int middle = (lo+hi)/2;
        if(v == arr[middle]){
            return middle;
        }
        else
        {
            if(v < arr[middle]){
                lo = middle+1;
            }  
            else
            {
                hi = middle-1;
            }
        }
    }
    return -1;
}

Cătălin Fr&#226;ncu · Accepted Answer

Here is a modified algorithm that works.

public static int binarySearch(int[] arr, int v) {
  int lo = -1;
  int hi = arr.length - 1;

  while (hi - lo > 1 ) {
    int middle = (lo + hi) / 2;
    if (arr[middle] > v) {
      lo = middle;
    } else {
      hi = middle;
    }
  }

  if (v == arr[hi]) {
    return hi;
  } else {
    return -1;
  }
}

The key points are:

The interval (lo, hi] is exclusive to the left, inclusive to the right.
At each step we throw away one half of the interval. We stop when we are down to one element. Attempts to terminate early offer only a minimal performance boost, while they often affect code legibility and/or introduce bugs.
When arr[middle] = v we assign hi = middle, thus throwing away the right half. This is safe to do because we don't care any occurrences of v past middle. We do care about arr[middle], which may or may not be the first occurrence, and it is for this reason that we made (lo, hi] inclusive to the right. If there are occurrences of v before middle, we will find them in subsequent iterations.
As a side note, the more natural definition [0, n) inclusive to the left, exclusive to the right, can be used to find the last occurrence of v.

In my experience, this inclusive-exclusive interval definition produces the shortest, clearest and most versatile code. People keep trying to improve on it, but they often get tangled up in corner cases.

Issues with Binary Search Algorithm

Answers (1)

Related Questions