Oscar

Reputation: 1147

Contiguous Subarrays. Is there a better solution than O(n^2)?

I am solving the following problem:

You are given an array arr of N integers. For each index i, you are required to determine the number of contiguous subarrays that fulfill the following conditions:

  1. The value at index i must be the maximum element in the contiguous subarrays, and
  2. These contiguous subarrays must either start from or end with i.

Input: Array arr is a non-empty list of unique integers that range from 1 to 1,000,000,000. Size N is between 1 and 1,000,000.

Output: An array where each index i contains an integer denoting the maximum number of contiguous subarrays of arr[i]

Example: arr = [3, 4, 1, 6, 2] output = [1, 3, 1, 5, 1]

I have the following naive O(n^2) solution:

  int[] countSubarrays(int[] arr) {
    int len = arr.length;
    int[] output = new int[len];

    for (int i = 0; i < len; i++) {
      // the subarray [i..i] always counts
      output[i] = 1;

      // move right until an element >= arr[i] is found
      for (int j = i + 1; j < len; j++) {
        if (arr[i] <= arr[j]) {
          break;
        }
        output[i]++;
      }

      // move left until an element >= arr[i] is found
      for (int j = i - 1; j >= 0; j--) {
        if (arr[i] <= arr[j]) {
          break;
        }
        output[i]++;
      }
    }

    return output;
  }

According to the coding platform there is another solution:

We can next observe that the index of the latest element to the left of the ith element which is larger than it determines which subarrays ending at index i are valid - specifically, the ones beginning to the right of that larger element. Letting G[i] be equal to the largest index j such that j < i and a[j] > a[i] (or equal to 0 if there’s no such j), then L[i] = i - G[i]. We’ve now reduced the problem to computing these values G[1..N] for an array of N distinct integers.

Computing G[i] for each i from 1 to N is a promising approach, but we’ll still need to consider how to do so as efficiently as possible. We can observe that it’s not possible to compute G[i] purely based on the values of G[i-1], a[i-1], and a[i]; we may need more information about earlier a values as well, but would like to avoid simply scanning over all of them. Out of earlier indices j (such that j < i), we can consider which indices are worth considering as potential candidates for G[i] - for example, if there exists a pair of indices j and k such that j < k and a[j] < a[k], can index j ever be a candidate for G[i] for any i > k? If we can maintain information about the set of these possible candidate indices as we go through the array, it’s possible to efficiently determine the one that’s actually equal to G[i] for each i.

I am not able to grasp the intuition behind this logic. Any suggestions?

Upvotes: 0

Views: 1525

Answers (1)

ruakh

Reputation: 183290

So, to start with, let's simplify the problem by only considering subarrays that end with their maximum value. (Subarrays that start with their maximum value can be found using the same approach, but starting at the end of the array and working backwards instead of starting at the beginning and working forwards. Fortunately, we don't really need to worry about double-counting subarrays that start and end at equal values, because the problem specifies that all elements are distinct, so the only such subarrays are the subarrays of length 1, which we can easily handle by subtracting 1 from each index's count, or n in total.)

I see from your code that you've already figured out that, for any given index i, what we need to find is the greatest index j < i such that A[j] > A[i], because the subarrays [(j+1)..i], [(j+2)..i], ... [i..i] are exactly the subarrays meeting our condition. In your code, you increment output[i] as you work your way backward to j; but you can actually just write output[i] += i - j after the end of the loop. This means that, provided we can find j in amortized-constant time, we can solve the problem in O(n) time instead of O(n^2) time.

So, how do we do that? The trick is to keep track of our previous answers in an array called G; for example, if A is [10,0,5,3], then G will be [-1,0,0,2]. (Do you see why?) Then, when we're trying to find G[i], we can consult the values we've already stored in G[0..(i−1)] to jump backward faster; instead of j--, we can write j = G[j].
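To make the jumping concrete, here is a minimal sketch in Java. The class name PreviousGreater and the method name computeG are only illustrative, and -1 is assumed to mean "no greater element to the left", matching the [10,0,5,3] example above.

```java
import java.util.Arrays;

class PreviousGreater {
    // g[i] = greatest index j < i with a[j] > a[i], or -1 if no such j exists.
    // Assumes all values are distinct, as the problem statement guarantees.
    static int[] computeG(int[] a) {
        int[] g = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            int j = i - 1;
            // a[j] is too small, so everything between g[j] and j is too small
            // as well: jump straight to g[j] instead of stepping with j--.
            while (j >= 0 && a[j] < a[i]) {
                j = g[j];
            }
            g[i] = j;
        }
        return g;
    }

    public static void main(String[] args) {
        // prints [-1, 0, 0, 2]
        System.out.println(Arrays.toString(computeG(new int[] {10, 0, 5, 3})));
    }
}
```

For index 2 (value 5), the search starts at j = 1, sees that 0 is smaller, and jumps to G[1] = 0, where 10 is larger; so G[2] = 0.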

For each individual index i, it's still possible that it will take up to n jumps backward to find G[i]; but that only happens if A[i] is greater than all previous values and all previous values were decreasing. That can only happen once. More generally, the total number of jumps across all values of i will be O(n), because we only make the jump from j to G[j] the first time we find a value greater than A[j]. (Do you see why?)
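Putting the pieces together, one possible O(n) version of countSubarrays might look like the sketch below. It runs the jump-based search once in each direction; the names prev and next, and the sentinels -1 and n for "no greater element on that side", are assumptions made for this illustration rather than anything from the platform's solution.

```java
import java.util.Arrays;

class ContiguousSubarrays {
    static int[] countSubarrays(int[] arr) {
        int n = arr.length;
        int[] prev = new int[n]; // nearest greater element to the left, or -1
        int[] next = new int[n]; // nearest greater element to the right, or n

        // Forward pass: jump through earlier answers instead of using j--.
        for (int i = 0; i < n; i++) {
            int j = i - 1;
            while (j >= 0 && arr[j] < arr[i]) {
                j = prev[j];
            }
            prev[i] = j;
        }

        // Backward pass: the same idea, mirrored.
        for (int i = n - 1; i >= 0; i--) {
            int j = i + 1;
            while (j < n && arr[j] < arr[i]) {
                j = next[j];
            }
            next[i] = j;
        }

        int[] output = new int[n];
        for (int i = 0; i < n; i++) {
            // (i - prev[i]) subarrays end at i and (next[i] - i) start at i;
            // the length-1 subarray [i..i] is in both groups, so subtract 1.
            output[i] = (i - prev[i]) + (next[i] - i) - 1;
        }
        return output;
    }

    public static void main(String[] args) {
        // prints [1, 3, 1, 5, 1]
        System.out.println(Arrays.toString(countSubarrays(new int[] {3, 4, 1, 6, 2})));
    }
}
```

Each inner while loop can run many times for a single i, but every index is jumped over at most once per pass, which is where the O(n) total comes from.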

Upvotes: 4
