Reputation: 62155
In Matters Computational I found this interesting linear search implementation (it's actually my Java implementation ;-)):
public static int linearSearch(int[] a, int key) {
int high = a.length - 1;
int tmp = a[high];
// put a sentinel at the end of the array
a[high] = key;
int i = 0;
while (a[i] != key) {
i++;
}
// restore original value
a[high] = tmp;
if (i == high && key != tmp) {
return NOT_CONTAINED;
}
return i;
}
It basically uses a sentinel, which is the searched for value, so that you always find the value and don't have to check for array boundaries. The last element is stored in a temp variable, and then the sentinel is placed at the last position. When the value is found (remember, it is always found due to the sentinel), the original element is restored and the index is checked if it represents the last index and is unequal to the searched for value. If that's the case, -1 (NOT_CONTAINED) is returned, otherwise the index.
While I found this implementation really clever, I wonder if it is actually useful. For small arrays, it seems to be always slower, and for large arrays it only seems to be faster when the value is not found. Any ideas?
EDIT
The original implementation was written in C++, so that could make a difference.
Upvotes: 1
Views: 1075
Reputation:
Yes - it does because while loop doesn't have 2 comparisons as opposed to standard search.
It is twice as fast.It is given as optimization in Knuth Vol 3.
Upvotes: 0
Reputation: 95334
A sentinel search goes back to Knuth. It value is that it reduces the number of tests in a loop from two ("does the key match? Am I at the end?") to just one.
Yes, its useful, in the sense that it should significantly reduce search times for modest size unordered arrays, by virtue of eliminating conditional branch mispredictions. This also reduces insertion times (code not shown by the OP) for such arrays, because you don't have to order the items.
If you have larger arrays of ordered items, a binary search will be faster, at the cost of larger insertion time to ensure the array is ordered.
For even larger sets, a hash table will be the fastest.
The real question is what is the distribution of sizes of your arrays?
Upvotes: 2
Reputation: 37007
It's not thread-safe, for example, you can lose your a[high]
value through having a second thread start after the first has changed a[high]
to key
, so will record key
to tmp
, and finish after the first thread has restored a[high]
to its original value. The second thread will restore a[high]
to what it first saw, which was the first thread's key
.
It's also not useful in java, since the JVM will include bounds checks on your array, so your while loop is checking that you're not going past the end of your array anyway.
Upvotes: 10
Reputation: 34391
Will you ever notice any speed increase from this? No
Will you notice a lack of readability? Yes
Will you notice an unnecessary array mutation that might cause concurrency issues? Yes
Premature optimization is the root of all evil.
Upvotes: 7
Reputation: 96109
See also the 'finding a tiger in africa' joke.
Punchline = An experienced programmer places a tiger in cairo so that the search terminates.
Upvotes: 2
Reputation: 23265
Doesn't seem particularly useful. The "innovation" here is just to get rid of the iteration test by combining it with the match test. Modern processors spend 0 time on iteration checks these days (all the computation and branching gets done in parallel with the match test code).
In any case, binary search kicks the ass of this code on large arrays, and is comparable on small arrays. Linear search is so 1960s.
Upvotes: 5