Reputation: 19578
Binary search can be implemented in many ways-recursive, iterative, conditionals, etc. I took this from Bentley's book "Programming pearls: Writing correct programs" which is an iterative implementation, and that includes a bug.
public class BinSearch
{
static int search( int [] A, int K ) {
int l = 0;
int u = A. length -1;
int m;
while ( l <= u ) {
m = (l+u) /2;
if (A[m] < K){
l = m + 1;
} else if (A[m] == K){
return m;
} else {
u = m-1;
}
}
return -1;
}
}
I found a bug in the line m = (l+u) /2; it can lead to overflow. How can we avoid this overflow in this binary search?
Upvotes: 16
Views: 4677
Reputation: 1580
Although Machtl and others have already written the way to overcome the overflow bug but adding just for the sake of completeness
Upvotes: -1
Reputation: 73608
The iterative implementation binary search indeed has a bug. change m = (l+u)/2
. As others have mentioned it can lead to integer overflow. Replace that by m = l + (u-l)/2
.
From experience I have time and again seen buggy binary search implementations. Binary search although a simple concept involving divide and conquer can be difficult to get right. Its easy to change the above m
assignment. Hope this helps...
Upvotes: 1
Reputation: 46960
As several others have said, the fix is simple, certainly the simplest I've seen for a 100-point bounty! Here is another, which has a nice symmetry, even if it takes a few more clock cycles:
m = (l >> 1) + (u >> 1) + (l & u & 1);
You should not malign Bentley for "a bug" until you have better information. When he wrote that article for the ACM (some time in the 1980's I think), he was pseudocoding and writing in 32-bit C; machines with gigabytes of RAM didn't exist. Even if they had, with 4-byte ints, a 32-bit machine can't have an array with more than 2^28 ints. Therefore the highest possible index is 2^28-1. Doubling this value does not cause an in int
to overflow.
Of course, it's exactly the same with 32-bit Java. You need the broken kludge of 64-bit Java - a language that allows objects of size approaching 2^64 but limits indices to 2^32-1 in order to cause this "bug" to appear.
What you call a bug is a change of operating assumptions. Every program in the universe will manifest some kind of flaw if the environment changes in the right way.
Upvotes: 5
Reputation: 171
Suppose l and u are int both fall into [0, 2^31-1]. If l, u >= 2^30, then (l+u) >= 2^31 is overflow. To avoid this, use
m = l + (u-l)/2;
instead. What is more, it may be more reasonable to write binary search like this:
public class BinSearch
{
static int search( int [] A, int K ) {
int l = -1; // index lower bound shift left by 1
int u = A.length; // index upper bound shift right by 1
int m;
while ( l + 1 < u ) {
m = l + (u-l)/2; // avoid overflow
if (A[m] < K){
l = m; // keep A[l] < K
} else {
u = m; // keep A[u] >= K
}
}
if ( (u == A.length) || (A[u] != K) ) return -1;
return u;
}
}
Upvotes: 6
Reputation: 984
Try the following:
change
m = (l+u) /2
to
m = (u-l) / 2 + l
The reason why the (l+u) / 2
can overflow becomes obvious if you consider a very large array of 2^31 - 1 elements (maximum value a signed 32-bit integer can hold).
In this case the first iteration is just fine because 2^31 - 1 + 0
is not a big deal but consider the case of l = m + 1
here. In the second iteration u is still the same and l is 2^31 / 2
so l + u
will lead to an overflow.
This way we are avoiding the addition of u + l
by first determining the relative middle between l and u (u - l) / 2
and then adding the lower number l to it so it becomes absolute. So during the operation m = (u-l) / 2 + l;
we never exceed the value of u.
To summarize the complete code:
public class BinSearch
{
static int search( int [] A, int K )
{
int l = 0;
int u = A. length -1;
int m;
while ( l <= u )
{
m = (u-l) / 2 + l;
if (A[m] < K)
l = m + 1;
else if (A[m] == K)
return m;
else
u = m - 1;
}
return -1;
}
}
Upvotes: 16
Reputation: 410
Try changing
m = (l+u) / 2
to
m = l + (u - l) / 2
It is trivial to see that both is equal and also the second statement prevents the overflow.
Upvotes: 2
Reputation: 21
I think you should change u = m-l; to u = m -1.
it's 1 not l. ---------addtion------ cause (l+u) may be greater than 2^31-1 (if the int is 32bits), so it may overflow. so you should change (l+u)/2 to l+((u-l)>>1), and u-l cannot be greater than 2^31-1.
Upvotes: 1