user1886376
user1886376

Reputation:

Find duplicates in array when there is more than one duplicated element

How can I find duplicates in array when there is more than one duplicated element?

When the array is only one duplicated element (for example: 1, 2, 3, 4, 4, 4, 5, 6, 7) then it is very easy:

int duplicate(int* a, int s)
{ 
    int x = a[0];
    for(int i = 1; i < s; ++i)
    {
        x = x ^ a[i];
    }
    for(int i = 0; i < a[s]; ++i)
    {
        x = x ^ i;
    }
    return x;
}

But if the input array contains more than one duplicated element (for example: 1, 2, 2, 2, 3, 4, 4, 4, 5, 6, 7), the above won't work. How can we solve this problem in O(n) time?

Upvotes: 0

Views: 1462

Answers (2)

Jean
Jean

Reputation: 483

Using a set is one of the possible generic solutions. Example in c++:

template <typename T>
void filter_duplicates(T* arr, int length) {
    std::unordered_set<T> set;
    for (int i = 0; i < length; ++i) {
        if (set.count(arr[i]) > 0) {
            // then it's a duplicate
        }
        set.insert(arr[i]);
    }
    // the set contains all the items, unduplicated
}

As unordered_set is implemented as a hash table, insertion and lookup are of amortized constant complexity. As a set can only contain unique keys, this effectively de-duplicates the items. We could finally convert back the set to an array. We could also use a map to count the occurrences.

If array elements are integers and that the maximum possible value is known, and fairly low, then the set can be replaced by a simple array either 1. of boolean or 2. of integer if we want to count the number of occurrences.

Upvotes: 1

MrSmith42
MrSmith42

Reputation: 10161

If space is no concern or the maximal number is quite low, you can simple use a kind of a bit-array and mark all already occurred numbers by setting the bit at the position of the number.

It'a a kind of HashSet with trivial (identity) hash-function. Tests and set cost O(1) time.

Upvotes: 1

Related Questions