Reputation: 11031
I have a simple std::vector
containing some numbers, which are sorted (in ascending order). I want to lookup an element, so far I use:
return std::lower_bound(vec.begin(), vec.end(), needle);
Where needle
is the element I look for. However, my vector tends to be quite long (millions of elements), but most of the time the contents are relatively predictable in a sense that if the first element is zero and the last element is N
, then the elements in between have value close to (N * index) / vec.size()
and are hence predictable.
Is there a modification of the lower bound, which would accept a hint (similarly to how std::map::emplace_hint()
does), such as:
assert(!vec.empty());
std::vector<int>::iterator hint = vec.begin() + std::min(vec.size() - 1,
(needle * vec.size()) / vec.back());
if(*hint > needle)
return std::lower_bound(vec.begin(), hint, needle);
else
return std::lower_bound(hint, vec.end(), needle);
This will work, but the lower_bound
ignores that it is close to the solution and will most likely start splitting the interval to halves (looking where we know that the needle most likely isn't), taking unnecessarily many steps. I know that there was an algorithm which starts with step 1, which it doubles until it overshoots the needle, and then does binary search in the given interval.
I forgot what is the name of the algorithm. Is it implemented in the STL?
Upvotes: 22
Views: 2545
Reputation: 372784
I think the algorithm you're looking for is called interpolation search which is a variation on binary search that, instead of looking at the midpoint of the array, linearly interpolates between the array endpoints to guess where the key should be. On data that's structured the way that yours is, the expected runtime is O(log log n), exponentially faster than a standard binary search.
There is no standard implementation of this algorithm in C++, but (as a totally shameless plug) I happened to have coded this one up in C++. My implementation is available online if you're interested in seeing how it works.
Hope this helps!
Upvotes: 28