Adam Romanek

Reputation: 1879

C++ threading vs. visibility issues - what's the common engineering practice?

From my studies I know the concepts of starvation, deadlock, fairness and other concurrency issues. However, theory differs from practice to an extent, and real engineering tasks often involve more detail than the academic treatment covers...

As a C++ developer I've been concerned about threading issues for a while...

Suppose you have a shared variable x which refers to some larger portion of the program's memory. The variable is shared between two threads A and B.

Now, if we consider read/write operations on x from both A and B threads, possibly at the same time, there is a need to synchronize those operations, right? So the access to x needs some form of synchronization which can be achieved for example by using mutexes.

Now let's consider another scenario where x is initially written by thread A, then passed to thread B (somehow), and that thread only reads x. Thread B then produces a response to x, called y, and passes it back to thread A (again, somehow). My question is: what synchronization primitives should I use to make this scenario thread-safe? I've read about atomics and, more importantly, memory fences - are these the tools I should rely on?

This is not a typical scenario in which there is a "critical section". Instead, some data is passed between threads with no possibility of concurrent writes to the same memory location. So, after being written, the data should first be "flushed" somehow, so that the other thread can see it in a valid and consistent state before reading. What is this called in the literature - is it "visibility"?

What about pthread_once and its Boost/std counterpart, i.e. call_once? Does it help if both x and y are passed between threads through a sort of "message queue" which is accessed by means of the "once" functionality? AFAIK it serves as a sort of memory fence, but I couldn't find any confirmation of this.

What about CPU caches and their coherency? What should I know about that from the engineering point of view? Does such knowledge help in the scenario mentioned above, or any other scenario commonly encountered in C++ development?

I know I might be mixing a lot of topics, but I'd like to better understand what the common engineering practice is so that I can reuse already-known patterns.

This question primarily concerns C++03, as this is my daily environment at work. Since my project mainly targets Linux, I may only use pthreads and Boost, including Boost.Atomic. But I'm also interested in whether anything concerning these matters has changed with the advent of C++11.

I know the question is abstract and not that precise, but any input could be useful.

Upvotes: 3

Views: 452

Answers (1)

Ben Voigt

Reputation: 283684

you have a shared variable x

That's where you've gone wrong. Threading is MUCH easier if you hand off ownership of work items using some sort of threadsafe consumer-producer queue, and from the perspective of the rest of the program, including all the business logic, nothing is shared.

Message passing also helps prevent cache collisions (because there is no true sharing -- except of the producer-consumer queue itself, and that has a trivial effect on performance if the unit of work is large -- and organizing the data into messages helps reduce false sharing).

Parallelism scales best when you separate the problem into subproblems. Small subproblems are also much easier to reason about.

You seem to already be thinking along these lines, but no, threading primitives like atomics, mutexes, and fences are not a good fit for application code that uses message passing. Find a real queue implementation (queue, circular ring buffer, Disruptor -- they go under different names, but all meet the same need). The primitives will be used inside the queue implementation, but never by application code.

Upvotes: 10
