Reputation: 63
After having searched the internet I was not able to satisfy myself that I had found a comprehensive set of situations in which a linear search would be preferable to a binary search.
I am essentially wondering whether it would be possible to compile a relatively definitive list of advice (from the point of view of general programming as one might find in industry). Alternatively I would much appreciate it if it could be verified that indeed I have seen all there is to say on the subject.
Upvotes: 5
Views: 11747
Reputation: 133950
You probably can't come up with a definitive list. For example, I did some tests a while back with searching sorted lists in .NET. With a sorted list of integers, binary search turned out to be faster than sequential search when the number of items was 13. With a sorted list of strings, that number was 8. With other types for which comparison was more expensive, the number was even smaller.
Running the same test using a different language or runtime library will give you different numbers. It could even depend on the memory access hardware and probably some other hardware considerations.
The conventional wisdom was (perhaps still is) that sequential search was so much simpler than binary search that the reduced complexity gave it a large advantage on small lists. The truth today is that CPU speeds and memory access are so fast that the simplicity of sequential search is a factor only when the lists are very small.
At best you can come up with a definitive set of rules that apply to one runtime configuration on specific hardware when comparing a specific data type. If you change environments or change data types, you have to write tests to benchmark it all over again.
Upvotes: 1
Reputation: 63
My list of reasons for choosing a linear search over a binary search are as follows:
The list is unsorted and is only to be searched once
The list is small (though this itself is a vague notion - I've read less than around 100 elements?)
The list will need sorting following the search operation (due to say an insertion), since the resorting will dominate the time complexity of the overall task
The data structure is not random access (like a linked-list)
There is no knowledge of the data that could aid searching (relative proximities?)
Upvotes: 1