Reputation: 3907
Let's say I have two arrays:
int ArrayA[] = {5, 17, 150, 230, 285};
int ArrayB[] = {7, 11, 57, 110, 230, 250};
Both arrays are sorted and can be any size. I am looking for an efficient algorithm to find if the arrays contain any duplicated elements between them. I just want a true/false answer, I don't care which element is shared or how many.
The naive solution is to loop through each item in ArrayA, and do a binary search for it in ArrayB. I believe this complexity is O(m * log n).
Because both arrays are sorted, it seems like there should be a more efficient algorithm.
I would also like a generic solution that doesn't assume that the arrays hold numbers (i.e. the solution should also work for strings). However, the comparison operators are well defined and both arrays are sorted from least to greatest.
Upvotes: 14
Views: 7803
Reputation:
If one list is much much shorter than the other, binary search is the way to go. If the lists are of similar length and you're happy with O(m+n), a standard "merge" would work. There are fancier algorithms that are more flexible. One paper I've come across in my own searches is:
http://www.cs.uwaterloo.ca/~ajsaling/papers/paper-spire.pdf
Upvotes: 4
Reputation: 21597
Since someone wondered about stl. Out-of-the-box, the set_intersection algorithm would do more than you want: it would find all the common values.
#include <vector>
#include <algorithm>
#include <iterator>
using namespace std;
// ...
int ArrayA[] = {5, 17, 150, 230, 285};
int ArrayB[] = {7, 11, 57, 110, 230, 250};
vector<int> intersection;
ThrowWhenWritten output_iterator;
set_intersection(ArrayA, ArrayA + sizeof(ArrayA)/sizeof(int),
ArrayB, ArrayB + sizeof(ArrayB)/sizeof(int),
back_insert_iterator<vector<int> >(intersection));
return !intersection.empty();
this runs in O(m+n) time, but it requires storing all the duplicates and doesn't stop when it finds the first dup.
Now, modifying the code from the gnu implementation of the stl, we can get more precisely what you want.
template<typename InputIterator1, typename InputIterator2>
bool
has_intersection(InputIterator1 first1, InputIterator1 last1,
InputIterator2 first2, InputIterator2 last2)
{
while (first1 != last1 && first2 != last2)
{
if (*first1 < *first2)
++first1;
else if (*first2 < *first1)
++first2;
else
return true;
}
return false;
}
Upvotes: 8
Reputation: 755457
If you are using C# 3.0 then why not take advantage of LINQ here?
ArrayA.Intersect(ArrayB).Any()
Not only is this generic (works for any comparable type) the implementation under the hood is pretty efficient (uses a hashing algorithm).
Upvotes: 1
Reputation: 103575
Glomek is on the right track, but kinda glossed over the algorithm.
Start by comparing ArrayA[0] to ArrayB[0]. if they are equal, you're done. If ArrayA[0] is less than ArrayB[0], then move to ArrayA[1]. If ArrayA[0] is more than ArrayB[0], then move to ArrayB[1].
Keeping stepping through till you reach the end of one array or find a match.
Upvotes: 0
Reputation: 111986
If the range of values is small, you could build a lookup table for one of them (time cost = O(N)) and then check if the bit is set from the other list (time cost = O(N)). If the range is large, you could do something similar with a hash table.
The mergesort trick from Glomek is an even better idea.
Upvotes: 0
Reputation: 124044
If you don't care about memory consumption, you can achieve good performance by using hash, i.e. create hash with keys = values of one array, and test values of second array against this hash
Upvotes: 3
Reputation: 25338
Pretend that you are doing a mergesort, but don't send the results anywhere. If you get to the end of either source, there is no intersection. Each time you compare the next element of each, if they are equal, there is an intersection.
For example:
counterA = 0;
counterB = 0;
for(;;) {
if(counterA == ArrayA.length || counterB == ArrayB.length)
return false;
else if(ArrayA[counterA] == ArrayB[counterB])
return true;
else if(ArrayA[counterA] < ArrayB[counterB])
counterA++;
else if(ArrayA[counterA] > ArrayB[counterB])
counterB++;
else
halt_and_catch_fire();
}
Upvotes: 40