Luke Kelly
Luke Kelly

Reputation: 443

Issues with Binary Search Algorithm

I have an assignment to write a binary search that returns the first iteration of the value we are looking for. I've been doing some research online and my search looks a lot like what i'm finding but i'm having an issue. If I pass this code an array that looks like this {10,5,5,3,2} it find the 5 at in the middle(The first thing it checks) and then just returns it. But that is not the first iteration of the 5 it is the second. What am I doing wrong? Is this even possible?

Thanks in advance!

The code(I'm using Java):

public static int binarySearch(int[] arr, int v){
    int lo = 0;
    int hi = arr.length-1;
    while(lo <= hi){
        int middle = (lo+hi)/2;
        if(v == arr[middle]){
            return middle;
        }
        else
        {
            if(v < arr[middle]){
                lo = middle+1;
            }  
            else
            {
                hi = middle-1;
            }
        }
    }
    return -1;
}

Upvotes: 1

Views: 300

Answers (1)

Cătălin Fr&#226;ncu
Cătălin Fr&#226;ncu

Reputation: 1199

Here is a modified algorithm that works.

public static int binarySearch(int[] arr, int v) {
  int lo = -1;
  int hi = arr.length - 1;

  while (hi - lo > 1 ) {
    int middle = (lo + hi) / 2;
    if (arr[middle] > v) {
      lo = middle;
    } else {
      hi = middle;
    }
  }

  if (v == arr[hi]) {
    return hi;
  } else {
    return -1;
  }
}

The key points are:

  • The interval (lo, hi] is exclusive to the left, inclusive to the right.
  • At each step we throw away one half of the interval. We stop when we are down to one element. Attempts to terminate early offer only a minimal performance boost, while they often affect code legibility and/or introduce bugs.
  • When arr[middle] = v we assign hi = middle, thus throwing away the right half. This is safe to do because we don't care any occurrences of v past middle. We do care about arr[middle], which may or may not be the first occurrence, and it is for this reason that we made (lo, hi] inclusive to the right. If there are occurrences of v before middle, we will find them in subsequent iterations.
  • As a side note, the more natural definition [0, n) inclusive to the left, exclusive to the right, can be used to find the last occurrence of v.

In my experience, this inclusive-exclusive interval definition produces the shortest, clearest and most versatile code. People keep trying to improve on it, but they often get tangled up in corner cases.

Upvotes: 1

Related Questions