shn
shn

Reputation: 5286

Quickest way to compute the number of shared elements between two vectors

Suppose I have two vectors of the same size vector< pair<float, NodeDataID> > v1, v2; I want to compute how many elements from both v1 and v2 have the same NodeDataID. For example if v1 = {<3.7, 22>, <2.22, 64>, <1.9, 29>, <0.8, 7>}, and v2 = {<1.66, 7>, <0.03, 9>, <5.65, 64>, <4.9, 11>}, then I want to return 2 because there are two elements from v1 and v2 that share the same NodeDataIDs: 7 and 64.

What is the quickest way to do that in C++ ?

Just for information, note that the type NodeDataIDs is defined as I use boost as:

typedef adjacency_list<setS, setS, undirectedS, NodeData, EdgeData> myGraph;
typedef myGraph::vertex_descriptor NodeDataID;

But it is not important since we can compare two NodeDataID using the operator == (that is, possible to do v1[i].second == v2[j].second)

Upvotes: 0

Views: 270

Answers (2)

shn
shn

Reputation: 5286

The simplest solution is just to put elements of the first vector in a set then for the second vector we insert each element in this set (ret = myset.insert(an_id)) and if ret.second is false then the element exists, thus we increase a counter.

set<NodeDataID> myset;
int counter = 0;

for(int i = 0; i < v1.size(); ++i)
   myset.insert(v1[i].second);

for(int i = 0; i < v2.size(); ++i)
{
   pair<set<NodeDataID>::iterator,bool> ret = myset.insert(v2[i].second);
   if(ret.second == false)
      ++counter;
}

Upvotes: 0

Oswald
Oswald

Reputation: 31685

Put the elements of the first vector into a hash table. Iterate over the second vector, testing each element whether it is in the hash table.

A hash table has the advantage that inserts and lookups can be done in constant time. This means, finding the intersection can be done in linear time. This is optimal, because regardless of the algorithm, you have to look at each vector element at least once.

Boost has boost::intrusive::hashtable, but it's (as the name suggests), intrusive.

Upvotes: 2

Related Questions