Reputation: 1108
What is the best way to verify if there are common members within multiple vectors? The vectors aren't necessarily of equal size and they may contain custom data (such as structures containing two integers that represent a 2D coordinate).
For example:
vec1 = {(1,2); (3,1); (2,2)};
vec2 = {(3,4); (1,2)};
How to verify that both vectors have a common member?
Note that I am trying to avoid inneficient methods such as going through all elements and check for equal data.
Upvotes: 2
Views: 1755
Reputation: 444
Note that I am trying to avoid inneficient methods such as going through all elements and check for equal data.
You need to go through all elements at least once, I assume you're implying you don't want to check every combinations. Indeed you don't want to do :
for all elements in vec1, go through the entire vec2 to check if the element is here. This won't be efficient if your vectors have a big number of elements.
If you prefer a linear time solution and you don't mind using extra memory here is what you can do :
You need a hashing function to insert element in an unordered_map or unordered_set See https://stackoverflow.com/a/13486174/2502814
// next_permutation example
#include <iostream> // std::cout
#include <unordered_set> // std::unordered_set
#include <vector> // std::vector
using namespace std;
namespace std {
template <>
struct hash<pair<int, int>>
{
typedef pair<int, int> argument_type;
typedef std::size_t result_type;
result_type operator()(const pair<int, int> & t) const
{
std::hash<int> int_hash;
return int_hash(t.first + 6495227 * t.second);
}
};
}
int main () {
vector<pair<int, int>> vec1 {{1,2}, {3,1}, {2,2}};
vector<pair<int, int>> vec2 {{3,4}, {1,2}};
// Copy all elements from vec2 into an unordered_set
unordered_set<pair<int, int>> in_vec2;
in_vec2.insert(vec2.begin(),vec2.end());
// Traverse vec1 and check if elements are here
for (auto& e : vec1)
{
if(in_vec2.find(e) != in_vec2.end()) // Searching in an unordered_set is faster than going through all elements of vec2 when vec2 is big.
{
//Here are the elements in common:
cout << "{" << e.first << "," << e.second << "} is in common!" << endl;
}
}
return 0;
}
Output : {1,2} is in common!
You can either do that, or copy all elements of vec1 into an unordered_set, and then traverse vec2. Depending on the sizes of vec1 and vec2, one solution might be faster than the other. Keep in mind that picking the smaller vector to insert in the unordered_set also means you will use less extra memory.
Upvotes: 1
Reputation: 5118
For non-trivial data sets, the most efficient method is probably to sort both vectors, and then use std::set_intersection function defined in , like follows:
#include <vector>
#include <algorithm>
using namespace std;
typedef vector<pair<int, int>> tPointVector;
tPointVector vec1 {{1,2}, {3,1}, {2,2}};
tPointVector vec2 {{3,4}, {1,2}};
std::sort(begin(vec1), end(vec1));
std::sort(begin(vec2), end(vec2));
tPointVector vec3;
vec3.reserve(std::min(vec1.size(), vec2.size()));
set_intersection(begin(vec1), end(vec1), begin(vec2), end(vec2), back_inserter(vec3));
You may get better performance with a nonstandard algorithm if you do not need to know which elements are different, but only the number of common elements, because then you can avoid having to create new copies of the common elements.
In any case, it seems to me that starting by sorting both containers will give you the best performance for data sets with more than a few dozen elements.
Here's an attempt at writing an algorithm that just gives you the count of matching elements (untested):
auto it1 = begin(vec1);
auto it2 = begin(vec2);
const auto end1 = end(vec1);
const auto end2 = end(vec2);
sort(it1, end1);
sort(it2, end2);
size_t numCommonElements = 0;
while (it1 != end1 && it2 != end2) {
bool oneIsSmaller = *it1 < *it2;
if (oneIsSmaller) {
it1 = lower_bound(it1, end1, *it2);
} else {
bool twoIsSmaller = *it2 < *it1;
if (twoIsSmaller) {
it2 = lower_bound(it2, end2, *it1);
} else {
// none of the elements is smaller than the other
// so it's a match
++it1;
++it2;
++numCommonElements;
}
}
}
Upvotes: 2
Reputation: 1
I believe you use a 2D tree to search in 2 dimenstions. An optimal algorithm to the problem you specified would fall under the class of geometric algorithms. Maybe this link is of use to you: http://www.cs.princeton.edu/courses/archive/fall05/cos226/lectures/geosearch.pdf .
Upvotes: 0