stgtscc

Reputation: 990

Multithreaded read-many, write-seldom array/vector iteration in C++

I need to almost constantly iterate over a sequence of structs in a read-only fashion, but for every 1M+ reads, one of the threads may append an item. I think a mutex would be overkill here, and I have also read that r/w locks have their own drawbacks for the readers.

I was thinking about using reserve() on a std::vector, but the answer to "Iterate over STL container using indices safe way to avoid using locks?" seemed to invalidate that.

Any ideas on what way might be fastest? The most important thing is for the readers to be able to quickly and efficiently iterate with as little contention as possible. The writing operations aren't time-sensitive.

Update: Another one of my use cases is that the "list" could contain pointers rather than structs, i.e. std::vector<MyClass*>. The same requirements apply.

Update 2: Hypothetical example

globally accessible:

typedef std::vector<MyClass*> Vector;
Vector v;
v.reserve(50);

Reader threads 1-10: (these run pretty much all the time)

.
.
int total = 0;
for (Vector::const_iterator it = v.begin(); it != v.end(); ++it)
{
   MyClass* ptr = *it;
   total += ptr->getTotal();
}
// do something with total
.
.

Writer threads 11-15:

MyClass* ptr = new MyClass();
v.push_back(ptr);

That's basically what happens here. Threads 1-15 could all be running concurrently, although generally there are only 1-2 reader threads and 1-2 writer threads.

Upvotes: 1

Views: 512

Answers (2)

user2116939

Reputation: 454

What I think could work here is your own wrapper around a vector, something like this:

template <typename T> class Vector
{
// constructor will be needed of course
public:
    std::shared_ptr<const std::vector<T> > getVector()
        { return mVector; }
    void push_back(const T&);

private:
    std::shared_ptr<std::vector<T> > mVector;
};

Then, whenever readers need to access a specific Vector, they should call getVector() and keep the returned shared_ptr until finished reading.
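
For illustration, the reader side in the question's example could then look something like this (a sketch only; sumTotals is just an illustrative name, MyClass comes from the question, and Vector is the wrapper sketched above):

int sumTotals(Vector<MyClass*>& v)
{
    // Take the snapshot once, then iterate it freely; it stays valid even if
    // a writer swaps in new storage in the meantime.
    std::shared_ptr<const std::vector<MyClass*> > snapshot = v.getVector();

    int total = 0;
    for (std::vector<MyClass*>::const_iterator it = snapshot->begin();
         it != snapshot->end(); ++it)
    {
        total += (*it)->getTotal();
    }
    return total;
    // 'snapshot' is released here; if it held the last reference to an old
    // storage block, that block gets deallocated at this point.
}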

But writers should always use Vector's push_back to add a new value. This push_back should then check whether mVector->size() == mVector->capacity() and, if so, allocate a new vector and assign it to mVector. Something like:

template <typename T> void Vector<T>::push_back(const T& t)
{
    if (mVector->size() == mVector->capacity())
    {
        // make certain here that new_size > old_size
        std::vector<T>* vec = new std::vector<T>();
        vec->reserve(mVector->size() * SIZE_MULTIPLIER);

        std::copy(mVector->begin(), mVector->end(), std::back_inserter(*vec));
        mVector.reset(vec);
    }
    // put 't' into 'mVector'. 'mVector' is guaranteed not to reallocate now.
    mVector->push_back(t);
}

The idea here is inspired by the RCU (read-copy-update) algorithm. When the storage space is exhausted, allocating new storage must not invalidate the old storage as long as there is at least one reader still accessing it. The new storage is allocated anyway, and any reader arriving after the allocation sees it. The old storage is deallocated as soon as no one is using it anymore (all of its readers are finished).

Since most hardware architectures provide atomic increment and decrement instructions, I think shared_ptr (and thus Vector) will be able to run completely lock-less.
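
For illustration, a simplified copy-on-every-append variant shows the lock-free handoff that shared_ptr's atomic reference count makes possible. This differs from the capacity-check version above; it assumes the C++11 std::atomic_load/std::atomic_store overloads for shared_ptr from <memory>, writers that are serialized externally, and SnapshotVector is just an illustrative name:

#include <memory>
#include <vector>

template <typename T>
class SnapshotVector   // illustrative name
{
public:
    SnapshotVector() : mVector(std::make_shared<std::vector<T> >()) {}

    // Readers: grab an immutable snapshot and iterate it without locking.
    std::shared_ptr<const std::vector<T> > getVector() const
    {
        return std::atomic_load(&mVector);
    }

    // Writer: copy, append, publish. A published vector is never mutated,
    // so readers holding an old snapshot are unaffected; the old storage is
    // freed when its last reader lets go of it.
    void push_back(const T& t)
    {
        std::shared_ptr<const std::vector<T> > old = std::atomic_load(&mVector);
        std::shared_ptr<std::vector<T> > next =
            std::make_shared<std::vector<T> >(*old);
        next->push_back(t);
        std::atomic_store(&mVector, std::shared_ptr<const std::vector<T> >(next));
    }

private:
    std::shared_ptr<const std::vector<T> > mVector;
};

Given the stated ratio of roughly one append per million reads, the extra copy per append should be negligible.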

One disadvantage to this approach though, is that depending on how long readers hold that shared_ptr you might end up with several copies of your data.

PS: hope I haven't made too many embarrassing errors in the code :-)

Upvotes: 4

Useless

Reputation: 67752

... using reserve() on a std::vector ...

This can only be useful if you can guarantee the vector will never need to grow. You've stated that the number of items is not bounded above, so you can't give that guarantee.

Notwithstanding the linked question, you could conceivably use std::vector just to manage memory for you, but it would take an extra layer of logic on top to work around the problems identified in the accepted answer.


The actual answer is: the fastest thing to do is minimize the amount of synchronization. What the minimal amount of synchronization is depends on details of your code and usage that you haven't specified.


For example, I sketched a solution using a linked-list of fixed-size chunks. This means your common use case should be as efficient as an array traversal, but you're able to grow dynamically without re-allocating.
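
Something along these lines, for illustration (assuming C++11 <atomic>, writers that are serialized externally, and no removals; ChunkedList and CHUNK_SIZE are just illustrative names, and the destructor is omitted for brevity):

#include <atomic>
#include <cstddef>

template <typename T, std::size_t CHUNK_SIZE = 64>
class ChunkedList
{
    struct Chunk
    {
        T items[CHUNK_SIZE];
        std::atomic<std::size_t> count;   // number of published slots
        std::atomic<Chunk*> next;         // next chunk, if any
        Chunk() : count(0), next(nullptr) {}
    };

public:
    ChunkedList() : mHead(new Chunk), mTail(mHead) {}

    // Writer: fill the current chunk; when it is full, link a fresh one.
    // A slot is published by bumping 'count' after the slot is written.
    void push_back(const T& t)
    {
        Chunk* tail = mTail;
        std::size_t n = tail->count.load(std::memory_order_relaxed);
        if (n == CHUNK_SIZE)
        {
            Chunk* fresh = new Chunk;
            fresh->items[0] = t;
            fresh->count.store(1, std::memory_order_release);
            tail->next.store(fresh, std::memory_order_release);
            mTail = fresh;
            return;
        }
        tail->items[n] = t;
        tail->count.store(n + 1, std::memory_order_release);
    }

    // Readers: walk chunk by chunk; within a chunk this is a plain array
    // traversal, which keeps the common case cache-friendly.
    template <typename Func>
    void for_each(Func f) const
    {
        for (const Chunk* c = mHead; c != nullptr;
             c = c->next.load(std::memory_order_acquire))
        {
            const std::size_t n = c->count.load(std::memory_order_acquire);
            for (std::size_t i = 0; i != n; ++i)
                f(c->items[i]);
        }
    }

private:
    Chunk* mHead;   // fixed after construction
    Chunk* mTail;   // touched only by the writer
};

A reader's loop from the question then becomes a call to for_each accumulating the totals, while the occasional writer calls push_back.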

However, the implementation turns out to be sensitive to questions like:

  • whether you need to remove items
    • whenever they're read?
    • only from the front or from other places?
  • whether you want the reader to busy-wait if the container is empty
    • whether this should use some kind of backoff
  • what degree of consistency is required?

Upvotes: 0
