multithreading c++11 mutex atomic memory-fences

Reputation: 9801

Mutexes, atomics and fences: what offers the best tradeoff and portability? C++11

I'm trying to dig deeper and better understand what options I have when writing multi-threaded applications in C++11.

In short, I see these 3 options so far:

mutexes, with an explicit locking and unlocking mechanism: they keep threads in sync by locking and unlocking. This is costly and doesn't guarantee the order in which my code executes, but this solution is often quite portable across different memory models.
atomic operations: since atomic = one single operation without a race that is always consistent, the sync is accomplished without locking and unlocking. With no race there is no need for locking, and atomic operations are highly optimized, but atomics still can't guarantee the order in which my code will be executed.
fences: they create a block in my code where nothing can be re-ordered by the compiler. They are less flexible and tend to be costly in terms of code maintenance, because I always have to keep an eye on what is really being executed and in what order, but they also improve caching techniques, and among these 3 solutions they are probably the one with the most predictable behaviour.
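
For illustration, here is a minimal sketch of what the three options above could look like in plain C++11. All function and variable names are made up for this example, and it only shows the mechanics, not a recommendation of one option over another.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Option 1 (mutex): a plain int protected by explicit locking.
int count_with_mutex(int per_thread) {
    int counter = 0;
    std::mutex m;
    auto work = [&] {
        for (int i = 0; i < per_thread; ++i) {
            std::lock_guard<std::mutex> lock(m);  // unlocks at end of scope
            ++counter;
        }
    };
    std::thread a(work), b(work);
    a.join();
    b.join();
    return counter;
}

// Option 2 (atomic): the increment itself is one indivisible operation.
int count_with_atomic(int per_thread) {
    std::atomic<int> counter{0};
    auto work = [&] {
        for (int i = 0; i < per_thread; ++i) {
            counter.fetch_add(1);  // default ordering: memory_order_seq_cst
        }
    };
    std::thread a(work), b(work);
    a.join();
    b.join();
    return counter.load();
}

// Option 3 (fences): relaxed flag accesses, ordered by explicit fences.
int publish_with_fences() {
    int payload = 0;
    int seen = 0;
    std::atomic<bool> ready{false};
    std::thread producer([&] {
        payload = 42;
        // Keeps the payload write ordered before the flag store.
        std::atomic_thread_fence(std::memory_order_release);
        ready.store(true, std::memory_order_relaxed);
    });
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_relaxed)) {
            std::this_thread::yield();
        }
        // Pairs with the release fence in the producer.
        std::atomic_thread_fence(std::memory_order_acquire);
        seen = payload;
    });
    producer.join();
    consumer.join();
    return seen;
}
```

With two threads each doing per_thread increments, both counters should end at 2 * per_thread, and the consumer should observe the published payload.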

This is more or less the core of what I got from my first lessons about threading and memory models. My problem is:

I was going for lock-free data structures and atomics to achieve flexibility and good performance. The problem is that an x86 machine apparently performs memory re-ordering differently from an ARM one, and I would like to keep my code as portable as possible, at least across these 2 platforms. So what kind of approach can you suggest for writing portable multi-threaded software when 2 platforms are not guaranteed to have the same re-ordering mechanisms? Or are atomic operations the best choice as things stand, and I got all this wrong?

For example, I noticed that the Intel TBB library (which is not C++11 code) is being ported to ARM/Android with heavy modifications to the part dedicated to atomics. So maybe I can write portable multi-threaded code in C++11 with lock-free data structures, and optimize the atomics part later when porting my library to another platform?

Upvotes: 2

Answers (2)

Thomas McGuire

Reputation: 5466

Yep, x86 and ARM have different memory models. The C++11 memory model, however, is not platform-specific: it has the same behavior everywhere.

That means the implementation of the C++11 atomics is different on each platform. On x86, which has a fairly strong memory model, the implementation of std::atomic might get away without special assembler instructions when storing a value, while on ARM the implementation needs special locking or fence instructions internally.
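
To make that concrete, here is a small sketch (the function name pass_message and the variables are invented for this example). The source contains nothing platform-specific. The compiler is what picks the instructions: typically ordinary loads on x86, and barrier or special load/store instructions on ARM.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hands a value from one thread to another using std::atomic with its
// default (sequentially consistent) ordering.
int pass_message() {
    int data = 0;
    int result = 0;
    std::atomic<bool> ready{false};

    std::thread writer([&] {
        data = 7;           // plain write, made visible by the store below
        ready.store(true);  // default: memory_order_seq_cst
    });
    std::thread reader([&] {
        while (!ready.load()) {  // default: memory_order_seq_cst
            std::this_thread::yield();
        }
        result = data;  // guaranteed to see 7 on any conforming platform
    });

    writer.join();
    reader.join();
    return result;
}
```

The same function should return 7 whether it is built for x86 or for ARM. Only the generated machine code differs.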

So you can simply use the atomic classes in C++11; they will work the same on all platforms. If you want to, you can even tweak the memory order, if you are absolutely sure what you are doing. A weaker memory order might be faster, since the implementation of the atomics might need fewer assembler instructions for locks and fences internally.
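
As a rough sketch of what tweaking the memory order can look like (the function names are hypothetical, and whether it is actually faster depends on the platform and has to be measured):

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// A counter where only the final total matters, so each increment can use
// memory_order_relaxed: still atomic, but with no ordering guarantees.
int count_relaxed(int threads, int per_thread) {
    std::atomic<int> hits{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&hits, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                hits.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& th : pool) {
        th.join();  // join() synchronizes, so the load below sees every increment
    }
    return hits.load(std::memory_order_relaxed);
}

// Publishing data needs ordering, but release/acquire is enough here;
// full sequential consistency is not required.
int publish_release_acquire() {
    int payload = 0;
    int seen = 0;
    std::atomic<bool> ready{false};
    std::thread producer([&] {
        payload = 99;
        ready.store(true, std::memory_order_release);
    });
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;  // ordered after the producer's write to payload
    });
    producer.join();
    consumer.join();
    return seen;
}
```

Leaving out the explicit memory order arguments gives the default memory_order_seq_cst, which is the safe starting point when in doubt.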

I can highly recommend watching Herb Sutter's talk Atomic Weapons for some detailed explanations about this.

Upvotes: 1

Andrew Tomazos

Reputation: 68698

The issues surrounding multi-threaded programming are not language-specific or architecture-specific. You are better off studying them first from a general point of view and only afterwards, as a second step, specializing that general understanding to specific languages, libraries, platforms, etc.

The textbook required when I went to school was:

Principles of Concurrent and Distributed Programming - Ben-Ari

The second edition is from 2006, I believe. There may be better ones, but this should suffice for starters.

Upvotes: 1
