helpermethod

Reputation: 62155

Is this linear search implementation actually useful?

In Matters Computational I found this interesting linear search implementation (it's actually my Java implementation ;-)):

public static int linearSearch(int[] a, int key) {
    // NOT_CONTAINED is assumed to be -1, defined elsewhere in the class
    int high = a.length - 1;
    int tmp  = a[high];

    // put a sentinel at the end of the array
    a[high] = key;

    int i = 0;

    while (a[i] != key) {
        i++;
    }

    // restore original value
    a[high] = tmp;

    if (i == high && key != tmp) {
        return NOT_CONTAINED;
    }

    return i;
}

It basically uses a sentinel: the searched-for value is written to the last position, so the loop always finds the value and never has to check the array boundary. The last element is first saved in a temp variable, then the sentinel is placed at the last position. Once the value is found (remember, it is always found thanks to the sentinel), the original element is restored. If the match occurred at the last index and the key differs from the original last element, the value was not actually in the array and -1 (NOT_CONTAINED) is returned; otherwise the index is returned.

While I found this implementation really clever, I wonder if it is actually useful. For small arrays it always seems to be slower, and for large arrays it only seems to be faster when the value is not found. Any ideas?
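To make the comparison concrete, here is a rough micro-benchmark sketch (my own addition, not from the post) that times both variants on a large array for the not-found case; JIT warm-up and measurement noise make such timings indicative at best:

```java
import java.util.Random;

public class SearchBench {
    static final int NOT_CONTAINED = -1;

    static int sentinelSearch(int[] a, int key) {
        int high = a.length - 1;
        int tmp = a[high];
        a[high] = key;              // sentinel guarantees termination
        int i = 0;
        while (a[i] != key) i++;    // single comparison per iteration
        a[high] = tmp;              // restore original value
        return (i == high && key != tmp) ? NOT_CONTAINED : i;
    }

    static int plainSearch(int[] a, int key) {
        for (int i = 0; i < a.length; i++)  // bounds check + match check
            if (a[i] == key) return i;
        return NOT_CONTAINED;
    }

    public static void main(String[] args) {
        // values in [0, 1_000_000), so key = -1 is guaranteed absent
        int[] a = new Random(42).ints(1_000_000, 0, 1_000_000).toArray();
        int key = -1; // absent value: worst case, full scan

        long t0 = System.nanoTime();
        int r1 = sentinelSearch(a, key);
        long t1 = System.nanoTime();
        int r2 = plainSearch(a, key);
        long t2 = System.nanoTime();

        System.out.println("sentinel: " + (t1 - t0) + " ns, result " + r1);
        System.out.println("plain:    " + (t2 - t1) + " ns, result " + r2);
    }
}
```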

EDIT

The original implementation was written in C++, so that could make a difference.

Upvotes: 1

Views: 1075

Answers (6)

user256717

Reputation:

Yes, it does: the while loop needs only one comparison per iteration instead of the two in a standard search.

It can be roughly twice as fast. This optimization is given in Knuth, Vol. 3.
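For reference, a minimal sketch of the standard linear search being compared against; each iteration performs two checks (the loop bound and the key match), which the sentinel variant folds into one:

```java
public class StandardSearch {
    static final int NOT_CONTAINED = -1;

    public static int linearSearch(int[] a, int key) {
        for (int i = 0; i < a.length; i++) {  // check 1: loop bound
            if (a[i] == key) {                // check 2: key match
                return i;
            }
        }
        return NOT_CONTAINED;
    }

    public static void main(String[] args) {
        int[] a = {4, 8, 15, 16, 23, 42};
        System.out.println(linearSearch(a, 16)); // 3
        System.out.println(linearSearch(a, 7));  // -1
    }
}
```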

Upvotes: 0

Ira Baxter

Reputation: 95334

A sentinel search goes back to Knuth. Its value is that it reduces the number of tests in the loop from two ("Does the key match? Am I at the end?") to just one.

Yes, it's useful, in the sense that it should significantly reduce search times for modest-size unordered arrays by eliminating the bounds-check branch and its potential mispredictions. It also keeps insertion times low (code not shown by the OP) for such arrays, because you don't have to keep the items ordered.

If you have larger arrays of ordered items, a binary search will be faster, at the cost of larger insertion time to ensure the array is ordered.

For even larger sets, a hash table will be the fastest.
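The two alternatives mentioned above can be sketched with the standard library (my own addition, not part of the answer): `Arrays.binarySearch` for sorted arrays, and a `HashSet` for hash-based lookup.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class LookupAlternatives {
    public static void main(String[] args) {
        int[] sorted = {3, 9, 14, 27, 81};

        // O(log n) lookup; requires the array to be kept sorted
        int idx = Arrays.binarySearch(sorted, 27);
        System.out.println(idx); // 3

        // O(1) expected lookup; no ordering needed, but extra memory
        Set<Integer> set = new HashSet<>();
        for (int v : sorted) set.add(v);
        System.out.println(set.contains(27)); // true
        System.out.println(set.contains(28)); // false
    }
}
```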

The real question is what is the distribution of sizes of your arrays?

Upvotes: 2

Stephen Denne

Reputation: 37007

It's not thread-safe. For example, you can lose your a[high] value: a second thread that starts after the first has changed a[high] to its key will record that key as its tmp, and if it finishes after the first thread has restored a[high], it will "restore" a[high] to the first thread's key rather than the original value.

It's also not useful in Java, since the JVM includes bounds checks on your array, so your while loop is checking that you're not going past the end of the array anyway.

Upvotes: 10

erikkallen

Reputation: 34391

Will you ever notice any speed increase from this? No

Will you notice a lack of readability? Yes

Will you notice an unnecessary array mutation that might cause concurrency issues? Yes

Premature optimization is the root of all evil.

Upvotes: 7

Martin Beckett

Reputation: 96109

See also the 'finding a tiger in Africa' joke.
Punchline: an experienced programmer places a tiger in Cairo so that the search terminates.

Upvotes: 2

Keith Randall

Reputation: 23265

Doesn't seem particularly useful. The "innovation" here is just to get rid of the iteration test by combining it with the match test. Modern processors spend essentially no time on iteration checks (the bounds computation and branching get done in parallel with the match-test code).

In any case, binary search kicks the ass of this code on large arrays, and is comparable on small arrays. Linear search is so 1960s.

Upvotes: 5
