Reputation: 10487
I'm using C11* atomics to manage a state enum between a few threads. The code resembles the following:
static _Atomic State state;

void setToFoo(void)
{
    atomic_store_explicit(&state, STATE_FOO, memory_order_release);
}

bool stateIsBar(void)
{
    return atomic_load_explicit(&state, memory_order_acquire) == STATE_BAR;
}
This assembles (for an ARM Cortex-M4) to:
<setToFoo>:
    ldr   r3, [pc, #8]    ; load address of state from the literal pool below
    dmb   sy              ; Memory barrier
    movs  r2, #0          ; STATE_FOO == 0
    strb  r2, [r3, #0]    ; store STATE_FOO
    bx    lr
    .word 0x00000000      ; literal pool: address of state

<stateIsBar>:
    ldr   r3, [pc, #16]   ; load address of state from the literal pool below
    ldrb  r0, [r3, #0]    ; load state
    dmb   sy              ; Memory barrier
    sub.w r0, r0, #2      ; comparison and return: r0 = (state == STATE_BAR), with STATE_BAR == 2
    clz   r0, r0
    lsrs  r0, r0, #5
    bx    lr
    .word 0x00000000      ; literal pool: address of state
Why are the fences placed before the release store and after the acquire load? My mental model assumed that a barrier would be placed after a release (to "propagate" the stored variable, and all stores before it, to other threads) and before an acquire (to receive all previous stores from other threads).
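For reference, the store and load above can also be written with standalone C11 fences. The sketch below is only an illustration (the State enum is assumed, with STATE_FOO == 0 and STATE_BAR == 2 chosen to match the disassembly); for the fence-based form to synchronize, C11 requires the release fence to be sequenced before the store and the acquire fence to be sequenced after the load:

#include <stdatomic.h>
#include <stdbool.h>

/* Assumed State values, chosen to match the generated code above. */
typedef enum { STATE_FOO = 0, STATE_BAR = 2 } State;

static _Atomic State state;

void setToFoo_fenced(void)
{
    atomic_thread_fence(memory_order_release);                       /* fence, then the store */
    atomic_store_explicit(&state, STATE_FOO, memory_order_relaxed);
}

bool stateIsBar_fenced(void)
{
    State s = atomic_load_explicit(&state, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);                       /* the load, then the fence */
    return s == STATE_BAR;
}

This fence-based form is at least as strong as the release store / acquire load pair above; it is shown only to make the placement of the fences relative to the accesses visible at the source level.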
*While this particular example is given in C11, the situation is identical in C++11, as the two share the same concepts (and even the same enums) when it comes to memory ordering. gcc and g++ emit the same machine code in this situation. See http://en.cppreference.com/w/c/atomic/memory_order and http://en.cppreference.com/w/cpp/atomic/memory_order
Upvotes: 4
Views: 974
Reputation: 3917
The memory fence before the store guarantees that the store cannot be reordered before any earlier memory accesses. Similarly, the memory fence after the load guarantees that the load cannot be reordered after any later memory accesses. Combining the two creates a synchronizes-with relation between the release store and the acquire load that observes it.
T1: on-deps(A) -> fence -> write(A)     (everything the value of A depends on, then the fence, then the write of A)
T2: read(A) -> fence -> deps-on(A)      (the read of A, then the fence, then everything that depends on A)
Within each thread, write(A) comes after on-deps(A) and deps-on(A) comes after read(A). Once read(A) observes the value stored by write(A), the two fences therefore guarantee that on-deps(A) happens before deps-on(A): the reader sees everything the writer did before publishing A.
If either fence is moved to the other side of its atomic access, that chain of dependencies is broken, which can lead to inconsistent results (e.g. race conditions).
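To make the relation concrete, here is a minimal producer/consumer sketch (hypothetical names, not taken from the question): payload plays the role of on-deps(A)/deps-on(A), and ready plays the role of A.

#include <stdatomic.h>
#include <stdbool.h>

static int payload;            /* data that the write of "A" depends on */
static _Atomic bool ready;     /* "A" itself; zero-initialized, i.e. false */

void producer(void)            /* T1 */
{
    payload = 42;                                               /* on-deps(A) */
    atomic_store_explicit(&ready, true, memory_order_release);  /* fence, then write(A), as in setToFoo */
}

void consumer(void)            /* T2 */
{
    if (atomic_load_explicit(&ready, memory_order_acquire)) {   /* read(A), then fence, as in stateIsBar */
        int v = payload;       /* deps-on(A): guaranteed to see 42 here */
        (void)v;
    }
}

If the producer's barrier were issued after the store, or the consumer's barrier before the load, the write to payload could become visible after ready, or the read of payload could be satisfied before ready was checked, and the consumer could observe ready == true with a stale payload.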
Some more possible reading...
Upvotes: 5