Schmelix

Reputation: 93

How can I efficiently check if vector A contains the same elements as vector B?

I am searching for an efficient way to check whether vector A contains the same elements as vector B. The vectors have different sizes, and each element is itself a vector with two elements (x and y coordinates). I need the positions of the matching elements in vector A. At the moment I am doing it with nested for loops, but vector B can have up to 8000 elements and my program is really slow. I read about the algorithm library, but I could not find anything that would help me, or I did not understand it.

std::vector<std::vector<int>> VecA;
std::vector<std::vector<int>> VecB; // size of VecB >> VecA

for (std::size_t i = 0; i < VecA.size(); i++)
{
    for (std::size_t z = 0; z < VecB.size(); z++)
    {
        if (VecA.at(i) == VecB.at(z))
        {
            // Do something with VecA.at(i)
        }
    }
}

Thank you for your help.

Upvotes: 0

Views: 2599

Answers (2)

R Sahu

Reputation: 206737

My suggestion:

  1. Sort both vectors - that takes O(N*log(N)) operations.
  2. Iterate over both sorted vectors together and compare the elements. If you find a match, increment both iterators. If they don't match, increment the iterator pointing to the lesser element. This should take O(N) operations.
  3. Stop when one of the iterators reaches the end.

Upvotes: 0

Jack

Reputation: 133669

Some suggestions:

  • don't use a std::vector<int> for a pair of values; use a std::pair<int,int> or a custom structure
  • don't use std::vector if you want a fast way to check whether one collection contains an element of another; use a different data structure, e.g. a std::unordered_set<Coordinate>

With a std::unordered_set<Coordinate> everything would be much more efficient. Suppose:

struct Coordinate {
  int x;
  int y;
};

Now provide an operator== and a custom std::hash<Coordinate> specialization that creates a size_t from your Coordinate object, then either use std::set_intersection (on sorted ranges) or a loop that looks each element up in the set.

If you have some range restriction on the coordinates, so that both x and y fit in 16 bits (e.g. [0, 65536)), then hashing is trivial (x << 16 | y), and what's better is that it's unambiguous (two elements with the same key are the same element), which gives more room for optimizations.

Upvotes: 2
