Reputation: 35982

Intersect of two sorted arrays

Question> Given two sorted array A and B, return an array C containing elements common to A and B. The array C cannot contain duplicates.

Here is my solution, but my hunch is that it is wrong. But I cannot find a counter-example to dis-approve it. Can someone point it out for me? Or give me a counter-example?

Update:

The algorithm works as follows:

We hold one pointer to each array and move these pointers forward until we find a common element. Then if the common element is NOT in C, the found element is stored in C. Otherwise, depend on the element, we move the pointer forward accordingly.

#include <iostream>
#include <vector>
#include <random>
#include <iterator>
#include <algorithm>
using namespace std;

vector<int> Intersect(const vector<int>& vecIntsA, const vector<int>& vecIntB)
{
    int indA = 0;
    int indB = 0;
    vector<int> vecIntC;

    while(indA < vecIntsA.size() && indB < vecIntB.size())
    {
        if (vecIntsA[indA] == vecIntB[indB]) {
            if ( vecIntC.empty() || vecIntC.back() != vecIntsA[indA])
                vecIntC.emplace_back(vecIntsA[indA]);
            indA++;
            indB++;
        } else if (vecIntsA[indA] < vecIntB[indB]) 
            indA++;
        else // (vecIntsA[indA] > vecIntB[indB]) 
            indB++;        
    }

    return vecIntC;
}

int main()
{
   default_random_engine dre;
   uniform_int_distribution<int> dist(0, 100);

   vector<int> vecIntA;
   for(int i=0; i < 20; ++i)
    vecIntA.emplace_back(dist(dre));   
   sort(vecIntA.begin(), vecIntA.end());
   copy(vecIntA.cbegin(), vecIntA.cend(), ostream_iterator<int>(cout, ","));
   cout << endl;

   vector<int> vecIntB;
   for(int i=0; i < 24; ++i)
    vecIntB.emplace_back(dist(dre));   
   sort(vecIntB.begin(), vecIntB.end());
   copy(vecIntB.cbegin(), vecIntB.cend(), ostream_iterator<int>(cout, ","));
   cout << endl;

   vector<int> vecIntC = Intersect(vecIntA, vecIntB);
   copy(vecIntC.cbegin(), vecIntC.cend(), ostream_iterator<int>(cout, ","));

   return 0;
}

Upvotes: 2

Answers (3)

Prateek Oraon

Reputation: 101

Here is a quick solution with time complexity (p+q), where p and q are lengths of array A and B, respectively.

#include <iostream>
#include <vector>
#include <set>
#include <algorithm>
using namespace std;

set<int> intersect(vector<int> &A, vector<int> &B) {
    int j = 0;
    vector<int> V;
    for(int i = 0;i<A.size();i++){
        first:
        if(j == B.size()) break;
        if(A[i] == B[j]){
            V.push_back(A[i]);
            j++;
        }
        else if(A[i]>B[j]) { j++;goto first;}
    }
    set<int> S(V.begin(), V.end());
    return S;
}

int main() {
    vector<int> A,B;
    A = {1,2,3,3,4,5,6};
    B = {3,3,5,6};
    set<int> S;
    S = intersect(A,B);     
    set<int>::iterator iter;
    for(iter=S.begin(); iter!=S.end();++iter){
        cout<<(*iter)<<" ";
    }

    return 0;
}

This is a 2-pointer solution. Try to look for monotonicity in one of the loops as other loops move forward. If you find that, you have found your optimization. Happy coding :)

Upvotes: 1

Tom Moertel

Reputation: 73

Your algorithm seems reasonable. For what it's worth, I recently solved the exact same problem and came up with a similar algorithm for the case when the two arrays are of similar length.

In general, if you want to support the belief your algorithms produce good solutions, express the fundamental properties of good solutions in a way that you can check automatically. Then write automated tests against those properties. (This is a great style of testing popularized by QuickCheck.)

For this problem, for example, I expressed the fundamental property of array intersection as follows: "Given an intersection function f, for all sorted arrays A and B, f(A, B) == sorted(set(A) & set(B))." (In Python, set(xs) produces a set from the elements of xs, and the & operator applied to sets computes intersections.) In essence, I mapped the desired semantics of array intersection into Python's built-in semantics for sorting and set intersection. That way, I could build a correctness oracle for my implementation out of cheap, readily available parts. The final step was constructing random test cases and checking whether the mapping held (by consulting the oracle) for all of them.

Here's the corresponding code:

def check_function(f):
# fundamental property:
# forall sorted arrays A, B. intersect(A, B) == sorted(set(A) & set(B))
from math import factorial
from random import randrange
from nose.tools import assert_equal
for N in xrange(8):
    for _ in xrange(factorial(N)):  # get decent sample of problem space
        m, n = randrange(N + 1), randrange(N + 1)
        A = sorted(randrange(N + 1) for _ in xrange(m))
        B = sorted(randrange(N + 1) for _ in xrange(n))
        got = f(A, B)
        expected = sorted(set(A) & set(B))
        assert_equal(got, expected)

Upvotes: 0

Duncan Smith

Reputation: 538

You could always use the STL algorithms set_intersection and unique?

Upvotes: 1

Intersect of two sorted arrays

Answers (3)

Related Questions