Reputation: 57
I have a question about the definition of the synchronises-with relation in the C++ memory model when relaxed and acquire/release accesses are mixed on one and the same atomic variable. Consider the following example consisting of a global initialiser and three threads:
int x = 0;
std::atomic<int> atm(0);
[thread T1]
x = 42;
atm.store(1, std::memory_order_release);
[thread T2]
if (atm.load(std::memory_order_relaxed) == 1)
    atm.store(2, std::memory_order_relaxed);
[thread T3]
int value = atm.load(std::memory_order_acquire);
assert(value != 1 || x == 42); // Hopefully this is guaranteed to hold.
assert(value != 2 || x == 42); // Does this assert hold necessarily??
My question is whether the second assert in T3 can fail under the C++ memory model. Note that the answer to this SO question suggests that the assert could not fail if T2 used load/acquire and store/release; please correct me if I got this wrong. However, as stated above, the answer seems to depend on how exactly the synchronises-with relation is defined in this case. I was confused by the text on cppreference, and I came up with the following two possible readings.
1. The second assert fails. The store to atm in T1 could be conceptually understood as storing 1_release, where _release is an annotation specifying how the value was stored; along the same lines, the store in T2 could be understood as storing 2_relaxed. Hence, if the load in T3 returns 2, the thread actually read 2_relaxed; thus, the load in T3 does not synchronise-with the store in T1 and there is no guarantee that T3 sees x == 42. However, if the load in T3 returns 1, then 1_release was read, and therefore the load in T3 synchronises-with the store in T1 and T3 is guaranteed to see x == 42.
2. The second assert succeeds. If the load in T3 returns 2, then this load reads a side-effect of the relaxed store in T2; however, this store of T2 is present in the modification order of atm only if the modification order of atm contains a preceding store with release semantics. Therefore, the load/acquire in T3 synchronises-with the store/release of T1 because the latter necessarily precedes the former in the modification order of atm.
At first glance, the answer to this SO question seems to suggest that my reading 1 is correct. However, that answer seems to be different in a subtle way: all stores in the answer are release, and the crux of the question is to see that load/acquire and store/release establish synchronises-with between a pair of threads. In contrast, my question is about how exactly synchronises-with is defined when memory orders are heterogeneous.
I actually hope that reading 2 is correct since this would make reasoning about concurrency easier. Thread T2 does not read or write any memory other than atm; therefore, T2 itself has no synchronisation requirements and should therefore be able to use relaxed memory order. In contrast, T1 publishes x and T3 consumes it -- that is, these two threads communicate with each other so they should clearly use acquire/release semantics. In other words, if interpretation 1 turns out to be correct, then the code of T2 cannot be written by thinking only about what T2 does; rather, the code of T2 needs to know that it should not "disturb" synchronisation between T1 and T3.
In any case, knowing what exactly is sanctioned by the standard in this case seems absolutely crucial to me.
Upvotes: 2
Views: 815
Reputation: 6647
Because T2 uses a separate relaxed load and relaxed store rather than a single RMW, its store breaks the release sequence headed by T1's store, and the second assert can trigger (although not on a TSO platform such as x86).
You can fix this by either using acq/rel ordering in thread T2 (as you suggested) or by modifying T2 to use an atomic read-modify-write operation (RMW), like this:
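[Thread T2]
bool ok;
do {
    int val = 1; // expected value; reset each iteration because a failed CAS overwrites it
    ok = atm.compare_exchange_weak(val, 2, std::memory_order_relaxed);
} while (!ok); // compare_exchange_weak returns true on success, so retry until it succeeds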
The modification order of atm is 0-1-2. The RMW that writes 2 belongs to the release sequence headed by T1's store, so whether T3 picks up 1 or 2, no assert can fail.
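For anyone who wants to try this, here is a minimal sketch of the whole example with the RMW version of T2. The function name run_cas_variant and the Observation struct are made up for illustration and are not part of the original code. The sketch assumes T2 should wait until it sees 1. T3 reads x only when it observed 1 or 2, mirroring the short-circuit in the original asserts, so there is no data race when it observes 0.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper type: what T3 observed in one run.
struct Observation {
    int value;      // value T3 loaded from atm
    bool x_was_42;  // whether T3 saw x == 42 (only meaningful if value != 0)
};

// Hypothetical harness: runs T1, T2 (CAS variant) and T3 once.
Observation run_cas_variant() {
    int x = 0;
    std::atomic<int> atm(0);
    Observation result{-1, false};

    std::thread t1([&] {
        x = 42;
        atm.store(1, std::memory_order_release);
    });

    std::thread t2([&] {
        int expected = 1;
        // Retry until the 1 -> 2 transition succeeds. A failed CAS writes
        // the current value into expected, so reset it before retrying.
        while (!atm.compare_exchange_weak(expected, 2,
                                          std::memory_order_relaxed)) {
            expected = 1;
        }
    });

    std::thread t3([&] {
        int value = atm.load(std::memory_order_acquire);
        result.value = value;
        // Read x only after observing 1 or 2; both values belong to the
        // release sequence headed by T1's store.
        if (value != 0) {
            result.x_was_42 = (x == 42);
        }
    });

    t1.join();
    t2.join();
    t3.join();
    return result;
}
```

Calling run_cas_variant() in a loop and checking that x_was_42 is true whenever value is 1 or 2 exercises the guarantee, although a passing run on x86 proves little by itself, since that platform would not show the failure with the relaxed store either.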
Another valid implementation of T2 is:
[thread T2]
if (atm.load(std::memory_order_relaxed) == 1)
{
    atm.exchange(2, std::memory_order_relaxed);
}
Here the RMW itself is unconditional and it must be accompanied by an if-statement & (relaxed) load to ensure that the modification order of atm is 0-1 or 0-1-2. Without the if-statement, the modification order could be 0-2 which can cause the assert to fail. (This works because we know there is only one other write in the whole rest of the program. Separate if() / exchange is of course not in general equivalent to compare_exchange_strong.)
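If other writers could exist, a single compare_exchange_strong does the check and the write as one atomic step. A rough sketch of a one-shot version follows; the helper name advance_1_to_2 is hypothetical. When it succeeds it is an RMW, so it continues the release sequence in the same way as the exchange above; when it fails it only performs a load and writes nothing.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: atomically replaces 1 with 2.
// Returns true if the swap happened, false if atm did not hold 1.
bool advance_1_to_2(std::atomic<int>& atm) {
    int expected = 1;
    // The strong variant never fails spuriously, so no retry loop is
    // needed for a one-shot attempt.
    return atm.compare_exchange_strong(expected, 2,
                                       std::memory_order_relaxed);
}
```

Unlike the loop shown earlier, this does not wait for T1: if atm is still 0, T2 simply does nothing, which matches the behaviour of the original if-statement.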
In the C++ standard, the following quotes are related:
[intro.races]
A release sequence headed by a release operation A on an atomic object M is a maximal contiguous subsequence of side effects in the modification order of M, where the first operation is A, and every subsequent operation is an atomic read-modify-write operation.
[atomics.order]
An atomic operation A that performs a release operation on an atomic object M synchronizes with an atomic operation B that performs an acquire operation on M and takes its value from any side effect in the release sequence headed by A.
See also this question about why an RMW works in a release sequence.
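The quoted rules also explain why the other fix (acq/rel ordering in T2) works. T2's release store heads its own release sequence, so a T3 that reads 2 synchronizes with T2; T2's acquire load that read 1 synchronizes with T1; and happens-before is transitive across the two. Below is a minimal sketch of that chain. The function name run_acq_rel_chain is invented for illustration, and the spin loops are my own assumption, added so that the outcome is deterministic; they are not in the original code.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical harness: returns the value of x that T3 reads after it
// has observed atm == 2.
int run_acq_rel_chain() {
    int x = 0;
    std::atomic<int> atm(0);
    int seen = -1;

    std::thread t1([&] {
        x = 42;
        atm.store(1, std::memory_order_release);
    });

    std::thread t2([&] {
        // Acquire load: synchronizes with T1's release store once it reads 1.
        while (atm.load(std::memory_order_acquire) != 1) {
            std::this_thread::yield();
        }
        // Release store: heads a new release sequence for the value 2.
        atm.store(2, std::memory_order_release);
    });

    std::thread t3([&] {
        // Acquire load: synchronizes with T2's release store once it reads 2.
        while (atm.load(std::memory_order_acquire) != 2) {
            std::this_thread::yield();
        }
        seen = x;  // ordered after x = 42 via T1 -> T2 -> T3
    });

    t1.join();
    t2.join();
    t3.join();
    return seen;
}
```

Here T2 does take part in the synchronization chain, which is the price of keeping a separate load and store instead of a single RMW.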
Upvotes: 5