Reputation: 523
Here are two implementation of "forgetful" binary search, since they don't check for the exact match until they're finished.
1)
int bsearch1(int A[], int n, int target)
{
int low = 0, high = n - 1, mid = 0;
while (low < high)
{
mid = (low + high) >> 1;
if (target > A[mid])
low = mid + 1;
else
high = mid;
}
if (low == high)
{
if (A[low] == target)
return low;
}
return -1;
}
2)
int bsearch2(int A[], int n, int target)
{
int low = 0, high = n - 1, mid = 0;
while (low < high)
{
mid = (low + high) >> 1;
if (target < A[mid])
high = mid - 1;
else
low = mid;
}
if (low == high)
{
if (A[low] == target)
return low;
}
return -1;
}
NOTES: n
is the length of array A, target
is the element to find.
bsearch1
works fine, but bsearch2
runs into infinite loop, e.g A=[1,3,5,6], target=5. The diff between them is the conditional statement in while loop, the one in bsearch2
is just the opposite of bsearch1
. Both are perfectly right in logic. How can I know in advance bsearch2
is "wrong"? Can any one prove that the conditional statement in bsearch2
will lead to infinite loop (maybe in a mathematic view)? I can't find any clues and evidences until now.
EDIT: I have evaluated the whole process of the example A=[1,3,5,6], target=5:
1.low = 0, high = 3, mid = 1, A[mid] = 3
2.low = 1, high = 3, mid = 2, A[mid] = 5
3.low = 2, high = 3, mid = 2, A[mid] = 5
...
n.low = 2, high = 3, mid = 2, A[mid] = 5
I found that bsearch2
can not reach to low == high
this condition, thus can not exit the while loop. But I don't know why low
and high
can not reach to low == high
in the end likes bsearch1
.
Upvotes: 8
Views: 6497
Reputation: 21
So essentially it is caused by the property of the division of int type. In your case the "shift". They are in favor of the "Low" rather than the "high". This cannot be seen in the logic. The conclusion is that if you don't want to use complicated if(l== r-1) , ( ==) ? : ... thing, you always move the high rather than the low, to the mid. i.e., high = mid. Thanks for the question.
Also, in the question of "rotated sorted array with duplicates", in the corner where mid == A[high], we want to do --high. Actually we compare mid to A[high], not to A[L], because moving high is a better choice all the time.
https://leetcode.com/problems/search-in-rotated-sorted-array-ii/
bool search(vector<int>& nums, int target) {
//the key of binary search
//1. l and r have to always "squeeze" the target. That is how we move r and l, and how we consider to do r = m or r = m-1 (see 378. Kth Smallest Element in a Sorted Matrix )
// if(count < k) { l = midv+1;} else { r = midv;}
//2. Try to exclude half of the data
int l = 0, r = nums.size()-1;
int m, mval;
while(l<=r) {
m = l + ((r-l)>>1), mval = nums[m];
if (mval == target) return true;
if( mval < nums[r]) { //first see where the pivot is
if(mval < target && target <= nums[r]) l = m+1;
else r = m-1;
} else if ( mval > nums[r] ) {
if(nums[l] <= target && target < mval) r = m-1;
else l = m+1;
} else {//mval == nums[r], cannot decide pivot
--r; //move r while still be confident taht target is still in between
}
}
return false;
}
Upvotes: 0
Reputation: 66194
Your second algorithm suffers from a repeating cycle as soon as you encounter a partition where high == (low+1)
. When that happens, you essentially have mid = (low + low + 1)/2
, which is equivalent to (2*low)/2 + 1/2
. With integer division, this results in mid = low + 0
. Since your only movement on the low side is low = mid
, but they're already equivalent, you have an infinite loop.
The reason this does not happen on the first implementation is the direction of the integer division loss. It is always down. Therefore high
moving down does not suffer from this, and in-fact actually takes advantage of it.
To account for this in bsearch2
the same way bsearch1
takes advantage of the natural low-direction bias, the disgruntled rounding has to be accounted for in the mid-point calculation so it always moves in favor of the high-side. To do that, force the error out by biasing the calculation in the opposite direction. I.e. for bsearch2
, do this:
mid = (low + high + 1) >> 1;
and truth be told, to avoid overflow, this really should be
mid = low + ((high - low + 1) >> 1);
This will achieve the same effect adjusting bsearch2
midpoints that bsearch1
does. An example is worth noting:
#include <stdio.h>
int bsearch2(int A[], int n, int target)
{
int low = 0, high = n - 1, mid = 0;
while (low < high)
{
mid = low + ((high - low + 1) >> 1);
if (target < A[mid])
high = mid - 1;
else
low = mid;
}
if (low == high)
{
if (A[low] == target)
return low;
}
return -1;
}
int main()
{
// build a sorted array from 1...20
int A[20];
for (int i=0; i<sizeof(A)/sizeof(*A); ++i)
A[i] = i+1;
for (int i=0; i<=sizeof(A)/sizeof(*A)+1; ++i)
printf("Search for %d found index %d\n", i, bsearch2(A, sizeof(A)/sizeof(*A), i));
return 0;
}
Output
Search for 0 found index -1
Search for 1 found index 0
Search for 2 found index 1
Search for 3 found index 2
Search for 4 found index 3
Search for 5 found index 4
Search for 6 found index 5
Search for 7 found index 6
Search for 8 found index 7
Search for 9 found index 8
Search for 10 found index 9
Search for 11 found index 10
Search for 12 found index 11
Search for 13 found index 12
Search for 14 found index 13
Search for 15 found index 14
Search for 16 found index 15
Search for 17 found index 16
Search for 18 found index 17
Search for 19 found index 18
Search for 20 found index 19
Search for 21 found index -1
I hope that made sense.
Upvotes: 18
Reputation: 2184
First of all, run the second algorithm, wait until it stuck in infinite loop. Then put a break point and see current values like mid
, high
and low
.
I think that mid
exceeds low and takes negative value. So try this code:
high = mid - 1 >= 0 ? mid - 1 : 0;
Upvotes: 0
Reputation: 925
When you enter the loop second time, low is 1, high is 3 and so mid is 2.
In the while loop, there is no 'equal' check so what is happening is that everytime target (5) is actually equal to A[mid], so you're stuck in while loop.
while entering into while add a not equal to target check
while (low<high && A[mid]!=target) {
…
}
It should work.
Upvotes: 0