Steg Verner
Steg Verner

Reputation: 913

performing vector intersection in C++

I have a vector of vector of unsigned. I need to find the intersection of all these vector of unsigned's for doing so I wrote the following code:

int func()
{
   vector<vector<unsigned> > t;
   vector<unsigned> intersectedValues;
   bool firstIntersection=true;
   for(int i=0;i<(t).size();i++)
   {
       if(firstIntersection)
       {
           intersectedValues=t[0];
           firstIntersection=false;
       }else{
           vector<unsigned> tempIntersectedSubjects;                                                              
           set_intersection(t[i].begin(),
                  t[i].end(), intersectedValues.begin(),
                  intersectedValues.end(),
                  std::inserter(tempIntersectedSubjects, tempIntersectedSubjects.begin()));
           intersectedValues=tempIntersectedSubjects;
       }         
       if(intersectedValues.size()==0)
           break;
   }               
}

Each individual vector has 9000 elements and there are many such vectors in "t". When I profiled my code I found that set_intersection takes the maximum amount of time and hence makes the code slow when there are many invocations of func(). Can someone please suggest as to how can I make the code more efficient.

I am using: gcc (GCC) 4.8.2 20140120 (Red Hat 4.8.2-15)

EDIT: Individual vectors in vector "t" are sorted.

Upvotes: 5

Views: 3174

Answers (5)

demo
demo

Reputation: 11

set::set_intersection can be rather slow for large vectors. It's possible use create a similar function that uses lower_bound. Something like this:

template<typename Iterator1, typename Iterator2, typename Function>
void lower_bound_intersection(Iterator1 begin_1, Iterator1 end_1, Iterator2 begin_2, Iterator2 end_2, Function func)
{
    for (; begin_1 != end_1 && begin_2 != end_2;)
    {
        if (*begin_1 < *begin_2)
        {
            begin_1 = begin_1.lower_bound(*begin_2);
            //++begin_1;
        }
        else if (*begin_2 < *begin_1)
        {
            begin_2 = begin_2.lower_bound(*begin_1);
            //++begin_2;
        }
        else // equivalent
        {
            func(*begin_1);
            ++begin_1;
            ++begin_2;
        }
    }
}

Upvotes: 1

Steephen
Steephen

Reputation: 15824

To copy elements from vectors from vector for loop with emplace_back() may save your time. And no need of a flag if you change the iterator index of for loop. So for loop can be optimized, and condition check can be removed for each iteration.

void func()
{
   vector<vector<unsigned > > t;
   vector<unsigned int > intersectedValues;

   for(unsigned int i=1;i<(t).size();i++)
   {
        intersectedValues=t[0];

        vector<unsigned > tempIntersectedSubjects;                                                              
        set_intersection(t[i].begin(),
                  t[i].end(), intersectedValues.begin(),
                  intersectedValues.end(),
                   std::back_inserter(tempIntersectedSubjects);
        for(auto &ele: tempIntersectedSubjects)
                intersectedValues.emplace_back(ele);

        if( intersectedValues.empty())
          break;
   }               
}

Upvotes: 1

Baum mit Augen
Baum mit Augen

Reputation: 50053

You can make std::set_intersection as well as a bunch of other standard library algorithms run in parallel by defining _GLIBCXX_PARALLEL during compilation. That probably has the best work-gain ratio. For documentation see this.

Obligatory pitfall warning:

Note that the _GLIBCXX_PARALLEL define may change the sizes and behavior of standard class templates such as std::search, and therefore one can only link code compiled with parallel mode and code compiled without parallel mode if no instantiation of a container is passed between the two translation units. Parallel mode functionality has distinct linkage, and cannot be confused with normal mode symbols.

from here.

Another simple, though probably insignificantly small, optimization would be reserving enough space before filling your vectors.

Also, try to find out whether inserting the values at the back instead of the front and then reversing the vector helps. (Although I even think that your code is wrong right now and your intersectedValues is sorted the wrong way. If I'm not mistaken, you should use std::back_inserter instead of std::inserter(...,begin) and then not reverse.) While shifting stuff through memory is pretty fast, not shifting should be even faster.

Upvotes: 2

Galik
Galik

Reputation: 48615

I can't test this but maybe something like this would be faster?

int func()
{
   vector<vector<unsigned> > t;
   vector<unsigned> intersectedValues;

   // remove if() branching from loop

   if(t.empty())
       return -1;

   intersectedValues = t[0];

   // now start from 1
   for(size_t i = 1; i < t.size(); ++i)
   {
       vector<unsigned> tempIntersectedSubjects;
       tempIntersectedSubjects.reserve(intersectedValues.size()); // pre-allocate

       // insert at end() not begin()
       set_intersection(t[i].begin(),
              t[i].end(), intersectedValues.begin(),
              intersectedValues.end(),
              std::inserter(tempIntersectedSubjects, tempIntersectedSubjects.end()));

       // as these are not used again you can move them rather than copy
       intersectedValues = std::move(tempIntersectedSubjects);

       if(intersectedValues.empty())
           break;
   }
   return 0;
}

Another possibility:

Thinking about it using swap() could optimize the exchange of data and remove the need to re-allocate. Also then the temp constructor can be moved out of the loop.

int func()
{
   vector<vector<unsigned> > t;
   vector<unsigned> intersectedValues;

   // remove if() branching from loop

   if(t.empty())
       return -1;

   intersectedValues = t[0];

   // no need to construct this every loop
   vector<unsigned> tempIntersectedSubjects;

   // now start from 1
   for(size_t i = 1; i < t.size(); ++i)
   {
       // should already be the correct size from previous loop
       // but just in case this should be cheep
       // (profile removing this line)
       tempIntersectedSubjects.reserve(intersectedValues.size());

       // insert at end() not begin()
       set_intersection(t[i].begin(),
              t[i].end(), intersectedValues.begin(),
              intersectedValues.end(),
              std::inserter(tempIntersectedSubjects, tempIntersectedSubjects.end()));

       // swap should leave tempIntersectedSubjects preallocated to the
       // correct size
       intersectedValues.swap(tempIntersectedSubjects);
       tempIntersectedSubjects.clear(); // will not deallocate

       if(intersectedValues.empty())
           break;
   }
   return 0;
}

Upvotes: 2

Dietmar K&#252;hl
Dietmar K&#252;hl

Reputation: 153820

I don't have a framework to profile the operations but I'd certainly change the code to reuse the readily allocated vector. In addition, I'd hoist the initial intersection out of the loop. Also, std::back_inserter() should make sure that elements are added in the correct location rather than in the beginning:

int func()
{
    vector<vector<unsigned> > t = some_initialization();
    if (t.empty()) {
        return;
    }
    vector<unsigned> intersectedValues(t[0]);
    vector<unsigned> tempIntersectedSubjects;
    for (std::vector<std::vector<unsigned>>::size_type i(1u);
         i < t.size() && !intersectedValues.empty(); ++i) {
        std::set_intersection(t[i].begin(), t[i].end(),
                              intersectedValues.begin(), intersectedValues.end(),
                             std::back_inserter(tempIntersectedSubjects);
        std::swap(intersectedValues, tempIntersectedSubjects);
        tempIntersectedSubjects.clear();
    }
}               

I think this code has a fair chance to be faster. It may also be reasonable to intersect the sets different: instead of keeping one set and intersecting with that you could create a new intersection for pairs of adjacent sets and then intersect the first sets with their respect adjacent ones:

std::vector<std::vector<unsigned>> intersections(
    std::vector<std::vector<unsigned>> const& t) {
    std::vector<std::vector<unsigned>> r;
    std::vector<std::vector<unsignned>>::size_type i(0);
    for (; i + 1 < t.size(); i += 2) {
        r.push_back(intersect(t[i], t[i + 1]));
    }
    if (i < t.size()) {
        r.push_back(t[i]);
    }
    return r;
}

std::vector<unsigned> func(std::vector<std::vector<unsigned>> const& t) {
    if (t.empty()) { /* deal with t being empty... */ }
    std::vector<std::vector<unsigned>> r(intersections(t))
    return r.size() == 1? r[0]: func(r);
}

Of course, you wouldn't really implement it like this: you'd use Stepanov's binary counter to keep the intermediate sets. This approach assumes that the result is most likely non-empty. If the expectation is that the result will be empty that may not be an improvement.

Upvotes: 3

Related Questions