Quiti
Quiti

Reputation: 132

How many contiguous subarrays with max. n unique numbers

I found a programming challenge online and was wondering if there is a more efficient solution to this.

The problem: You are given a list of n numbers along with a number X which refers to the maximum number of different numbers that can be contained in a contiguous sub-array. We need to count all such contiguous sub-arrays which satisfy the condition imposed by X.

Input On the first row are two numbers n and x; the amount of numbers and the maximum number of unique numbers in the subarray.

Example:

5 2
1 2 3 1 1
ans = 10
explanation: ([1],[2],[3],[1],[1],[1,2],[2,3],[3,1],[1,1],[3,1,1])

My approach Loop through all subarrays of the list using two loops and count the number of unique numbers in the concerned subarray (using a set). Surely, there must be a more efficient way to calculate this? Sorry if this question doesn't belong here, feel free to edit it.

EDIT: nellex's corrected code that sometimes gives the wrong answer

int main() {
    int n, x;
    cin >> n >> x;

    vector<int> a;
    for (int i = 1; i <= n; i++) {
        int b;
        cin >> b;
        a.push_back(b);
    }

    int ans = 0, end = 1;
    set<int> uniq;
    map<int, int> freq;
    for (int start = 0; start < n; start++) {
        cout << start << " and end=" << end << endl;
        while (uniq.size() <= x && end < n) {
            if (uniq.size() == x && freq[a[end]] == 0) {
                break;
            }
            uniq.insert(a[end]);
            freq[a[end]]++;
            end++;
        }
        cout << "added " << end << " - " << start << " to ans" << endl;
        ans += end - start;
        freq[a[start]]--;
        if (freq[a[start]] == 0) {
            uniq.erase(a[start]);
        }
    }
    cout << ans;
}

EDIT: 1st test cases constraints:

1≤k≤n≤100

1≤xi≤10

The largest constraints:

1≤k≤n≤5⋅10^5

1≤xi≤10^9

Upvotes: 3

Views: 1228

Answers (2)

גלעד ברקן
גלעד ברקן

Reputation: 23955

We can solve this in O(n) time by keeping two pointers, p_l and p_r, both of which advance up the array, while updating a frequency count, h[e], for each element we encounter as well as the current number of unique items, k.

For example:

5 2
1 2 3 1 1

Let's look at each iteration

k = 0
h = {}
total = 0
p_r = -1
p_l = -1

1:   p_r = 0
     h = {1:1}
     k = 1
     total = 1

2:   p_r = 1
     h = {1:1, 2:1}
     k = 2
     total = 1 + 2 = 3

3:   p_r = 2
     h = {1:1, 2:1, 3:1}
     k = 3

  => move p_l until k equals X:
     p_l = 0
     h = {1:1-1=0, 2:1, 3:1}
     k = 3 - 1 = 2

     total = 3 + 2 = 5

1:   p_r = 3
     h = {1:1, 2:1, 3:1}
     k = 3

  => move p_l until k equals X:
     p_l = 1
     h = {1:1, 2:1-1=0, 3:1}
     k = 3 - 1 = 2

     total = 5 + 2 = 7

1:   p_r = 4
     h = {1:2, 2:0, 3:1}
     k = 2
     total = 7 + 3 = 10

Upvotes: 1

Nilesh
Nilesh

Reputation: 1343

A sliding window approach will fit as a better solution to this problem which will enable us to solve it in O(n*log(n)) by using a Set and a Map: https://ideone.com/v2CdZO

int main() {
    int n, x;
    cin >> n >> x;

    vector<int> a(n);
    for(int i = 0; i < n; i++) cin >> a[i];

    int end = 0;
    long long ans = 0;

    set<int> uniq;
    map<int, int> freq;
    for(int start = 0; start < n; start++) {
        while(uniq.size() <= x && end < n) {
            if(uniq.size() == x && freq[a[end]] == 0) {
                break;
            }
            uniq.insert(a[end]);
            freq[a[end]]++;
            end++;
        }
        ans += end - start;
        freq[a[start]]--;
        if(freq[a[start]] == 0) {
            uniq.erase(a[start]);
        }
    }
    cout << ans;
}

The algorithm works in the manner that for every element defined by the index start that is, a[start], we try to find the largest sub-array starting at start such that the unique elements in the sub-array is <= x. If the size of the identified sub-array is S, then we know that the element a[start] will be a part of S sub-arrays starting at index start.

If we do a dry run for the given example,

  • when start = 1, we'll generate the sub-arrays {[1], [1, 2]}
  • when start = 2, we'll generate the sub-arrays {[2], [2, 3]}
  • when start = 3, we'll generate the sub-arrays {[3], [3, 1], [3, 1, 1]}
  • when start = 4, we'll generate the sub-arrays {[1], [1, 1]}
  • when start = 5, we'll generate the sub-arrays {[1]}

Upvotes: 2

Related Questions