How to optimize a std::set intersection algorithm (C++)

Question

I'm struggling with a part of my college assignment. I have two subsets of std::set containers containing pointers to quite complex objects, but ordered by different criteria (which is why I can't use std::set_intersection()). I need to find elements that are contained in both subsets as fast as possible. There is a time/complexity requirement on the assignment.

I can do that in n*log(m) time where n is the size of the first subset and m is the size of the second subset by doing the following:

for(auto it = subset1.begin(), it != subset1.end(), it++){
    if(find(subset2.begin(), subset2.end(), *it))
        result.insert(*it);
}

This fails the time requirement, which says worst case linear, but average better than linear.

I found the following question here and I find the hashtable approach interesting. However, I fear that the creation of the hashtable might incur too much overhead. The class contained in the sets looks something like this:

class containedInSets {
   //methods
private: 
    vector member1;
    SomeObject member2;
    int member3;
}

I have no control over the SomeObject class, and therefore cannot specify a hash function for it. I'd have to hash the pointer. Furthermore, the vector may grow quite (in the thousands of entries).

What is the quickest way of doing this?

How to optimize a std::set intersection algorithm (C++)

Answers (1)

Related Questions