user2784016
user2784016

Reputation: 179

Maxima counting

Let A = {4, 1, 3, 2}

There are 10 contiguous sub-arrays. Let me consider the maximum of all those sub-arrays.

  Sub array   Max 

{4}            4
{1}            1
{3}            3
{2}            2
{4, 1}         4
{1, 3}         3
{3, 2}         3
{4, 1, 3}      4
{1, 3, 2}      3
{4, 1, 3, 2}   4

So:

Max #    Occurrences
 1           1
 2           1
 3           4
 4           4

Question: Is there any method for counting the number of such occurrences in O(n) time? Using dequeue I can find the maximum for all the intervals, but it will take O(n^2 log n) time, which is not feasible for large size array.

Upvotes: 1

Views: 656

Answers (1)

skyking
skyking

Reputation: 14390

The algorithm would be to first find the global maximum(s), these are maximum on all the intervals they're in (counting them should be O(N), multiple global maximas will complicate the complexity a bit though). Then you know that the other maximas will not be maximum for an intervall that extends beyond or including the global maximum, so you can then split down the problem into subarrays.

For example with your array you would get first to observe that 4 is the maximum of the array, it is consequently the maximum of all sub-arrays starting at or before 0 and ending at or after 0, that is 4. Then you get the preceding and succeding subarrays {} and {1, 3, 2}. Then you go on by checking {1, 3, 2} (we skip the empty one as it is trivial) there 3 is the maximum and it's the maximum of all subarrays where it's in, namely those which start at or before 1 and end at or after 1, there is four such (2*2), then it continues with the subarrays {1} and {2}.

The complication where we run into multiple maximas have to be considered also. For example if we're at the array {1, 4, 3, 4, 3, 2} you'll find two maximas (two fours), the same applies here the first 4 is maximum for all intervals where it belongs (which here is 10) and the second is maximum in 12, but beware that some of these intervals are the same, you have to decide how to count them (see below). Now the same applies that the other numbers can only be maxima in a subarray where 4 is not in, so we end up with 3 subarrays to consider, namely {1}, {3} and {3, 2}.

When you have multiple maximas in an interval it will not be as non-ambiguos how you want to count them. For example in the interval {4, 3, 4}, the 4 is maximum in the intervals {4} (the first), {4} (the second), {4, 3}, {3, 4} and {4, 3, 4}. All but the latest is quite straight forward, there's one, but how do you want to count the last? Well it's one interval so it's one, but you could also decide to count it twice. When analyzing the array according to the algorithm you will end up in having the first one being maximum in 3 intervals, and the second in 3 intervals which adds up to 6, which corresponds to the case where you count the last interval twice (because it has two 4s).

If you on the other hand want to count the interval only once you could for example use subarrays for counting them. You first construct the maximal subinterval containing each of them alone and the one that contains both. For example in the array {1, 4, 3, 4, 3, 2} you have first the subarrays starting before or at the first 4 and ends at or after the first 4, but before the second (this means you get 4 interval), then you do the corresponding for the second which means intervals starting at or before the second 4 but after the first and ends at or after the second 4 (this means you get 6 intervals), at last you count them containing both that is the interval that starts at or before the first 4 and ends at or after the second (this gives 6 intervals). When we sum that up we end up with 16 intervals.

Another solution could be to pick one of the 4s to be considered the maximum and then handle the remaining 4s in respective subarray. In this example one would maybe choose the second because it's the most middle 4. That would result in counting 12 intervals and then when you handle the left subinterval ({1, 4, 3}) you will there find that that 4 is the maximum in 4 intervals. It will also add up to 16 intervals where 4 is the maximum.

The algorithm does this in O(N log N) time on average and in O(N^2) time in worst case scenario if you're using naive approach to finding the maximum in the subarrays (which is O(N)). However using segment trees you should be able to reduce the maxima finding in the subarray to a O(log N) operation after the (one time) construction of the segment tree (which is O(N)). After the construction of the segment tree the worst case would be O(N log N) and on average O((log N)^2). In the worst case scenario the counting dominates the complexity so it's O(N log N) and on average the building of the segment tree dominates so it's O(N).

Upvotes: 3

Related Questions