Pedro Batista
Pedro Batista

Reputation: 1108

Check for common members in vector c++

What is the best way to verify if there are common members within multiple vectors? The vectors aren't necessarily of equal size and they may contain custom data (such as structures containing two integers that represent a 2D coordinate).

For example:

vec1 = {(1,2); (3,1); (2,2)};
vec2 = {(3,4); (1,2)};

How to verify that both vectors have a common member?

Note that I am trying to avoid inneficient methods such as going through all elements and check for equal data.

Upvotes: 2

Views: 1755

Answers (3)

Vincent
Vincent

Reputation: 444

Note that I am trying to avoid inneficient methods such as going through all elements and check for equal data.

You need to go through all elements at least once, I assume you're implying you don't want to check every combinations. Indeed you don't want to do :
for all elements in vec1, go through the entire vec2 to check if the element is here. This won't be efficient if your vectors have a big number of elements.

If you prefer a linear time solution and you don't mind using extra memory here is what you can do :

You need a hashing function to insert element in an unordered_map or unordered_set See https://stackoverflow.com/a/13486174/2502814

// next_permutation example
#include <iostream>     // std::cout
#include <unordered_set> // std::unordered_set
#include <vector>       // std::vector

using namespace std;


namespace std {
  template <>
  struct hash<pair<int, int>>
  {
    typedef pair<int, int>      argument_type;
    typedef std::size_t  result_type;

    result_type operator()(const pair<int, int> & t) const
    {
       std::hash<int> int_hash;
       return int_hash(t.first + 6495227 * t.second);
    }
  };
}


int main () {
  vector<pair<int, int>> vec1 {{1,2}, {3,1}, {2,2}};
  vector<pair<int, int>> vec2 {{3,4}, {1,2}};


  // Copy all elements from vec2 into an unordered_set
  unordered_set<pair<int, int>> in_vec2;
  in_vec2.insert(vec2.begin(),vec2.end());

  // Traverse vec1 and check if elements are here
  for (auto& e : vec1)
  {
    if(in_vec2.find(e) != in_vec2.end()) // Searching in an unordered_set is faster than going through all elements of vec2 when vec2 is big.
    {
      //Here are the elements in common:
      cout << "{" << e.first << "," <<  e.second << "} is in common!" << endl;
    }

  }

  return 0;
}

Output : {1,2} is in common!

You can either do that, or copy all elements of vec1 into an unordered_set, and then traverse vec2. Depending on the sizes of vec1 and vec2, one solution might be faster than the other. Keep in mind that picking the smaller vector to insert in the unordered_set also means you will use less extra memory.

Upvotes: 1

Martin J.
Martin J.

Reputation: 5118

For non-trivial data sets, the most efficient method is probably to sort both vectors, and then use std::set_intersection function defined in , like follows:

#include <vector>
#include <algorithm>


using namespace std;

typedef vector<pair<int, int>> tPointVector;

tPointVector vec1 {{1,2}, {3,1}, {2,2}};
tPointVector vec2 {{3,4}, {1,2}};

std::sort(begin(vec1), end(vec1));
std::sort(begin(vec2), end(vec2));

tPointVector vec3;
vec3.reserve(std::min(vec1.size(), vec2.size()));
set_intersection(begin(vec1), end(vec1), begin(vec2), end(vec2), back_inserter(vec3));

You may get better performance with a nonstandard algorithm if you do not need to know which elements are different, but only the number of common elements, because then you can avoid having to create new copies of the common elements.
In any case, it seems to me that starting by sorting both containers will give you the best performance for data sets with more than a few dozen elements.

Here's an attempt at writing an algorithm that just gives you the count of matching elements (untested):

auto it1 = begin(vec1);
auto it2 = begin(vec2);
const auto end1 = end(vec1);
const auto end2 = end(vec2);
sort(it1, end1);
sort(it2, end2);

size_t numCommonElements = 0;
while (it1 != end1 && it2 != end2) {
    bool oneIsSmaller = *it1 < *it2;
    if (oneIsSmaller) {
        it1 = lower_bound(it1, end1, *it2);
    } else {
        bool twoIsSmaller = *it2 < *it1;
        if (twoIsSmaller) {
            it2 = lower_bound(it2, end2, *it1);
        } else {
            // none of the elements is smaller than the other
            // so it's a match
            ++it1;
            ++it2;
            ++numCommonElements;
        }
    }
}

Upvotes: 2

aneez
aneez

Reputation: 1

I believe you use a 2D tree to search in 2 dimenstions. An optimal algorithm to the problem you specified would fall under the class of geometric algorithms. Maybe this link is of use to you: http://www.cs.princeton.edu/courses/archive/fall05/cos226/lectures/geosearch.pdf .

Upvotes: 0

Related Questions