Reputation: 23
I have an array arr of size n containing the integers 1 through n in a random order; for example, it might be arr = [4 5 3 6 2 1].
For each element e in arr, I need to find the closest element k that is less than e. ("Closest" in terms of position, not value; so in the above example, 5 is closer to 3 than to 6.) If there are equally-close lesser elements in both directions — as with 6 in the example above, whose immediate neighbors are both less than it — then it doesn't matter which of those elements is chosen.
So with the example above, arr = [4 5 3 6 2 1], one valid result is [3 3 2 2 1 NIL].
I know of an O(n) algorithm to solve this; it works by maintaining a stack to find the nearest smaller element to the left and to the right of every element.
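For reference, here is roughly what I mean by that, as a Python sketch (the function name nearest_smaller is just something I picked): one monotonic-stack pass records the nearest smaller element to the left of each position, a second pass does the same to the right, and each element then keeps whichever of the two is closer. Each index is pushed and popped at most once per pass, which is why the whole thing is O(n).

def nearest_smaller(arr):
    # For each element, return the closest smaller element (by position), or None.
    n = len(arr)

    # Pass 1: nearest smaller value to the LEFT of each position, using a stack
    # of indices whose values are strictly increasing.
    left_val, left_dist = [None] * n, [None] * n
    stack = []
    for i in range(n):
        while stack and arr[stack[-1]] >= arr[i]:
            stack.pop()
        if stack:
            left_val[i], left_dist[i] = arr[stack[-1]], i - stack[-1]
        stack.append(i)

    # Pass 2: the same thing, scanned from the right.
    right_val, right_dist = [None] * n, [None] * n
    stack = []
    for i in range(n - 1, -1, -1):
        while stack and arr[stack[-1]] >= arr[i]:
            stack.pop()
        if stack:
            right_val[i], right_dist[i] = arr[stack[-1]], stack[-1] - i
        stack.append(i)

    # Keep whichever side is closer; ties can go either way.
    result = []
    for lv, ld, rv, rd in zip(left_val, left_dist, right_val, right_dist):
        if ld is None:
            result.append(rv)        # may be None if nothing smaller exists at all
        elif rd is None or ld < rd:
            result.append(lv)
        else:
            result.append(rv)
    return result

print(nearest_smaller([4, 5, 3, 6, 2, 1]))   # [3, 3, 2, 2, 1, None]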
But surprisingly, a simple linear bidirectional search performed better than that algorithm, even though it seems to me to be O(n²) in the worst case:
for i = 1 to n
    right = i + 1
    left = i - 1
    while (left > 0 && right <= n)
        // ...find the closest element that is smaller than arr[i] and break while loop
What is the worst-case complexity of the above? Am I right that it's O(n²)?
Upvotes: 2
Views: 132
Reputation: 183251
It is worst-case Θ(n log n) time (assuming you implement the fix that Daniel mentions above — you need to make sure that you can scan all the way to the end in each direction).
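To make sure we're analyzing the same thing, here is a minimal Python sketch of that fixed bidirectional search (the names are mine): the search radius d keeps growing as long as at least one direction is still inside the array, rather than stopping as soon as either side runs off the end. Below, an element's "search distance" is essentially the final value of d for that element.

def closest_smaller_bidirectional(arr):
    # Naive bidirectional search, with the fix applied: keep widening the radius
    # while at least one side is still in bounds.
    n = len(arr)
    result = [None] * n
    for i in range(n):
        d = 1
        while i - d >= 0 or i + d < n:
            if i - d >= 0 and arr[i - d] < arr[i]:
                result[i] = arr[i - d]
                break
            if i + d < n and arr[i + d] < arr[i]:
                result[i] = arr[i + d]
                break
            d += 1
    return result

print(closest_smaller_bidirectional([4, 5, 3, 6, 2, 1]))   # [3, 4, 2, 3, 1, None]

(For the question's example this returns 4 and 3 at two positions where the sample output has 3 and 2; those are equal-distance ties, which the question says don't matter.)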
First, observe that if two elements a and b are neighbors, then either a < b or b < a. In either case, at least one of these elements will require a "search distance" of only 1.
By extension, at least half the elements of the array will require a search distance of only 1.
Similarly, if four elements [a, b, c, d] are all in a row, then one of them is less than the other three, so each of those other three finds a smaller element within the run, at a search distance of at most 3. Combined with the observation above, two of the four require a search distance of only 1 and a third requires a search distance of at most 3. For n = 4, the one remaining element is the minimum of the whole array, which has to scan all n−1 other positions just to confirm that nothing smaller exists, so the total search distance is at most 2·1 + 1·3 + (n−1).
We can continue this logic for increasing runs of length 2^k: within each run of length 2^j, every element except the run's minimum finds a smaller element at a distance of at most 2^j − 1, and charging each element to the shortest run that bounds it gives a total of at most 2^(k−1)·(2^1−1) + 2^(k−2)·(2^2−1) + 2^(k−3)·(2^3−1) + ⋯, plus (n−1) for the overall minimum, which again has to scan everything.
If n is a power of 2, then we can write it as 2^k, and the sum (setting the minimum's n−1 aside for a moment) is at most:
2^(k−1)·(2^1−1) + 2^(k−2)·(2^2−1) + 2^(k−3)·(2^3−1) + ⋯ + 2^0·(2^k−1)
= (2^k − 2^(k−1)) + (2^k − 2^(k−2)) + (2^k − 2^(k−3)) + ⋯ + (2^k − 2^0)
= k·2^k − (2^(k−1) + 2^(k−2) + ⋯ + 2^0)
= k·2^k − 2^k + 1
= n log n − n + 1
Adding back the (n−1) for the overall minimum, the grand total is at most n log n.
(If n is not a power of 2, then we can obtain a somewhat larger array by appending the values n+1, n+2, etc., until the length is a power of 2; since the appended values are larger than everything else, this can only increase the total search distance, and the padded total is less than 2n log 2n, which is still Θ(n log n).)
Of course, the above only establishes an upper bound (O rather than Θ); but I think it also shows how to actually approach the worst case: we want the smallest values spread as far apart as possible, in a powers-of-two fashion. For example, we can "expand" a list like this:
1 2
1 3 2 4
1 5 3 6 2 7 4 8
The total search distance for the above is 7 + 1 + 2 + 1 + 4 + 1 + 2 + 1 = 19; in general this construction needs a total search distance of (n/2)·log n + (n − 1), which is Θ(n log n), as predicted.
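If you want to check this empirically, here is a small Python sketch (the helper names are mine) that builds the doubling construction above and totals the search distances of the fixed bidirectional search. For sizes 1, 2, 4, 8, 16, 32, 64 the totals come out to 0, 2, 7, 19, 47, 111, 255, i.e. (n/2)·log n + (n − 1).

def expand(seq):
    # One doubling step of the construction: keep the existing (smaller) values in
    # order and interleave the next block of larger values between them,
    # e.g. expand([1, 2]) == [1, 3, 2, 4] and expand([1, 3, 2, 4]) == [1, 5, 3, 6, 2, 7, 4, 8].
    m = len(seq)
    out = []
    for j, v in enumerate(seq):
        out.append(v)
        out.append(m + j + 1)
    return out

def total_search_distance(arr):
    # Sum, over all elements, of how far the fixed bidirectional search has to look.
    n, total = len(arr), 0
    for i in range(n):
        d = 1
        while i - d >= 0 or i + d < n:
            if (i - d >= 0 and arr[i - d] < arr[i]) or (i + d < n and arr[i + d] < arr[i]):
                break
            d += 1
        else:
            d -= 1   # nothing smaller anywhere; the last in-bounds radius was d - 1
        total += d
    return total

arr = [1]
while len(arr) <= 64:
    print(len(arr), total_search_distance(arr))
    arr = expand(arr)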
Upvotes: 1