Atul
Atul

Reputation: 4320

Efficient way to get subset of std::set

I call a library function that accepts pointer to std::set and processes elements of it.

However, it processes only certain number of elements (let's say 100) and if the set has more elements it simply throws an exception. However, I receive set of much larger size. So I need efficient way to get subset of std::set.

Currently, I am copying 100 elements to temporary set and pass it to the function.

struct MyClass
{
    // Class having considerably large size instance
};

// Library function that processes set having only 100 elements at a time
void ProcessSet (std::set<MyClass>* ptrMyClassObjectsSet);

void FunctionToProcessLargeSet (std::set<MyClass>& MyClassObjSet)
{
    std::set<MyClass> MyClass100ObjSet;

    // Cannot pass MyClassObject as it is to ProcessSet as it might have large number of elements
    // So create set of 100 elements and pass it to the function
    std::set<MyClass>::iterator it;
    for (it = MyClassObjSet.begin(); it != MyClassObjSet.end(); ++it)
    {
        MyClass100ObjSet.insert (*it);

        if (MyClass100ObjSet.size() == 100)
        {
            ProcessSet (&MyClass100ObjSet);
            MyClass100ObjSet.clear();
        }
    }

    // Prrocess remaining elments
    ProcessSet (&MyClass100ObjSet);
    MyClass100ObjSet.clear();
}

But it's impacting performance. Can anyone please suggest better ways to do this?

Upvotes: 0

Views: 3489

Answers (2)

NathanOliver
NathanOliver

Reputation: 180490

Since it looks like you are locked into having to use a subset. I have tweaked your code a little and I think it might be faster for you. It is still an O(n) operation but there is no branching in the for loop which should increase performance.

void FunctionToProcessLargeSet(std::set<MyClass>& MyClassObjSet)
{
    int iteration = MyClassOgjSet.size() / 100; // get number of times we have collection of 100
    auto it = MyClassObjSet.begin();
    auto end = MyClassObjSet.begin();
    for (; iteration == 0; --iteration)
    {
        std::advance(end, 100); // move end 100 away
        std::set<MyClass> MyClass100ObjSet(it, std::advance(it, end));  // construct with iterator range
        std::advance(it, 100); // advace it to end pos
        ProcessSet(&MyClass100ObjSet); // process subset
    }
    if (MyClassOgjSet.size() % 100 != 0)  // get last subset
    {
        std::set<MyClass> MyClass100ObjSet(it, MyClassObjSet.end());
        // Prrocess remaining elments
        ProcessSet(&MyClass100ObjSet);
    }
}

Let me know if this runs faster for you.

Upvotes: 1

Rostislav
Rostislav

Reputation: 3977

Well, that sounds like a bad library design, but if you have to work with what you have then:

  • If library can accept a pair of iterators - that's the easy way to go using std::advance
  • If it's templated and can accept std::set<T>, then copying a part of your set to std::set<std::reference_wrapper<T>> might perform better if copying T is slow (see here to see that no copies are created)
  • If it only acceptsstd::set<ParticularObjectType>, I don't see a way around copying the data.

Hope this helps,

Rostislav.

Upvotes: 1

Related Questions