2607
2607

Reputation: 4125

multithreading, inter-thread communication, synchronization

One class XYZ has 3 member variable x, y, z. There is vector of [N] XYZ objects.

There are 3 threads A, B, C, they can access any objects in the vector and any member variable of that object.

class XYZ
{
public:
double x;
double y;
double z;
};

std::vector with N elements, where N is fixed throughout the program.

How to design the inter-thread communication to achieve thread-safety and maximum efficiency, i.e., minimum blocking.

Here is some of my thinking process, please correct me if I am wrong.

  1. Divide vector in smaller vectors and encapsulate into each thread class, then use a message queue to pass data around. The problems are all 3 threads can access anywhere of the vector and objects member, therefore it is difficult to sub-divide and encapsulate. Message queue itself needs blocking, i.e., reader needs to be blocked when sender is adding to the queue.
  2. Use atomic library to make access atomic, therefore avoiding blocking. The problem is atomic is OS dependent, i.e., some operation is considered atomic under Linux may not be atomic under Windows.
  3. Mutex, add 3 mutex objects for each member variable, e.g., mutex_x, mutex_y, mutex_z. However, the problem is that mutex is noncopyable,

i.e., if we have a class like,

class XYZ_mutex
{
public:
double x;
double y;
double z;
boost::mutex mutex_x;
boost::mutex mutex_y;
boost::mutex mutex_z;
};

We can NOT have a vector of XYZ_mutex, because .push_back() is a copy constructor.

Thanks.

Upvotes: 0

Views: 2618

Answers (4)

Adrian
Adrian

Reputation: 5681

Use option 1. But also split the vector into 3 parts and allow each thread to work on its chunk in order to not interfere with the other threads. You can always use the length of the vector to create these virtual boundaries. Heck, you can have each thread with its own vector. Then when you do an insert into the vector, send your request through a "load balancer" so it inserts in the right vector (just go greedy). You could also use the balancer to rebalance the vectors, given that some thread is slower than the rest (move items from 1 vector to the others)

Upvotes: 0

AndrzejJ
AndrzejJ

Reputation: 740

You need to consider what usage patterns and what consistency requirements your program has.

First and most important usage pattern is whether any of the threads needs to modify these structures. If not, you don't actually need any locking - just make sure the structure is populated before the threads start reading it.

If the threads do have to modify the structure, you need to consider consistency. You need to ask yourself, if there are any constraints, other than modifying an individual double value, that limit when one thread can see changes made by another thread.

Once you define that, start thinking if data can be split between threads in any way to reduce conflict between threads - even if it can't be completely eliminated, the answer the question whether it's better to keep three vectors of doubles, a vector of XYZ or some other organisation depenst on whether there will be some threads that access certain XYZ objects more often, or if there will be threads that access x members more often, while others access ys or zs more often etc.

If you can't say anything about the probability of accessing any member of any instance by any thread, it's hard to say how to best organise them. It might even be a good idea to place them on the heap so that they land in separate cache lines.

Overall, probably the best suggestion is to start by putting them in any data structure that's best suited for your needs (vector, map, set etc.), using a single mutex to synchronise the whole thing, writing your program and then testing if you experience a bottleneck.

Upvotes: 2

James Kanze
James Kanze

Reputation: 154047

There is no one right answer. The basic rules are:

  1. Try to organize things so that no thread every requires more than one mutex. Otherwise, you need some (usually non-obvious) solution to avoid the risk of a deadlock.

  2. Try to keep each mutex for as short a time as possible. If you can organize things so that there are n distinct arrays, each thread only accesses a single distinct array, and each distinct array is only accessed by a single class, this is optimal, since it means that each thread can act without locks, once it has acquired ownership of its distinct array. (It doesn't sound like this is the case, however.)

A lot will depend on how (and how often) the threads are accessing the elements of the vector.

Finally, there are several ways of handling the problem of mutexes in the XYZ object. The most obvious is to use a recent compiler, which supports move semantics; std::mutex is movable, and can be used in a std::vector. Failing that, you can use a boost::shared_ptr<> to the mutex.

Upvotes: 1

Tudor
Tudor

Reputation: 62469

I would go for option 1 (with a few changes, see below), simply because having a mutex for each item in the array is too costly.

So, it's a good idea to divide your array into smaller chunks, each protected by a mutex. Then, when a thread needs to access some part of the array, it can access another data structure that provides it with the mutex that it needs to lock depending on the interval in which the accessed item is (for example a hash table associating the index of the item with the mutex for that specific interval).

Upvotes: 1

Related Questions