Reputation: 1463
Consider this code:
std::atomic<int> x{ 0 };
std::atomic<int> y{ 0 };
int a;
int b;
void thread1()
{
//atomic op A
x.store(1, std::memory_order_relaxed);
//fence X
std::atomic_thread_fence(std::memory_order_seq_cst);
//sequenced-before P, thus in SC order X=>P
//atomic op P
a = y.load(std::memory_order_seq_cst);//0
//reads-before(from-read) Q, thus in SC order P=>Q
}
void thread2()
{
//atomic op Q
y.store(1, std::memory_order_seq_cst);
//sequenced-before B, thus in SC order Q=>B
//atomic op B
b = x.load(std::memory_order_seq_cst);
}
int main()
{
std::thread t2(thread2);
std::thread t1(thread1);
t1.join();
t2.join();
assert(a == 1 || b == 1);//true?
return 0;
}
The question is: is the assertion a == 1 || b == 1
always true in C++20?
I think it's true in C++17. The idea is this: first assume a get 0, then prove b must get 1. Let's split it into two parts:
C++17 has this in the standard:
For atomic operations A and B on an atomic object M, where A modifies M and B takes its value, if there is a memory_order::seq_cst fence X such that A is sequenced before X and B follows X in S, then B observes either the effects of A or a later modification of M in its modification order.
It's exactly the case here. Op A and B are on atomic object x, where A stores 1 in x and B reads x's value; and there is the seq_cst fence X; A is sequenced before X and B follows X in S(the above); and there's no modification of M later than A. So B observes the value written by A. So b gets 1.
Is the above reasoning correct?
But in C++20, the above sited text about fences is removed (by P0668), and upgraded to "strengthened" seq_cst fences, which reads:
An atomic operation A on some atomic object M is coherence-ordered before another atomic operation B on M if
- A is a modification, and B reads the value stored by A, or
- A precedes B in the modification order of M, or
- A and B are not the same atomic read-modify-write operation, and there exists an atomic modification X of M such that A reads the value stored by X and X precedes B in the modification order of M, or
- there exists an atomic modification X of M such that A is coherence-ordered before X and X is coherence-ordered before B.
There is a single total order S on all memory_order::seq_cst operations, including fences, that satisfies the following constraints. First, if A and B are memory_order::seq_cst operations and A strongly happens before B, then A precedes B in S. Second, for every pair of atomic operations A and B on an object M, where A is coherence-ordered before B, the following four conditions are required to be satisfied by S:
- if A and B are both memory_order::seq_cst operations, then A precedes B in S; and
- if A is a memory_order::seq_cst operation and B happens before a memory_order::seq_cst fence Y , then A precedes Y in S; and
- if a memory_order::seq_cst fence X happens before A and B is a memory_order::seq_cst operation, then X precedes B in S; and
- if a memory_order::seq_cst fence X happens before A and B happens before a memory_order::seq_- cst fence Y , then X precedes Y in S.
I don't see how a==1 || b==1
is guaranteed any more. It's only about how SC fences are ordered with regard to other SC operations, or operations that are coherence-ordered with each other.
I can't find anything else about how SC fences interact with non-SC atomic operations in the C++20 standard, and a load that does not see a value from a store doesn't mean it's coherence-ordered before the store. (Or does it? If so that would fix the problem; see comments) 31.4 : 3.2 applies when A precedes B in the modification order of M, but reads aren't part of the modification order of an object, are they?
Is it I didn't look into the standard hard enough, or the code simply doesn't work any more in C++20? If it's the later, is the regression intended? (In P0668, they claim to "strengthen" fences).
Upvotes: 9
Views: 633
Reputation: 58132
Yes, I think we can prove that a == 1 || b == 1
is always true. Most of the ideas here were worked out in comments by zwhconst and Peter Cordes, so I just thought I would write it up as an exercise.
(Note that X, Y, A, B below are used as the dummy variables in the standard's axioms, and may change from line to line. They do not coincide with the labels in your code.)
Suppose b = x.load()
in thread2 yields 0.
We do have the coherence ordering that you asked about. Specifically, if b = x.load
yields 0, then I claim that x.load()
in thread2 is coherence ordered before x.store(1)
in thread1, thanks to the third bullet in the definition of coherence ordering. For let A be x.load()
, B be x.store(1)
, and X be the initialization x{0}
(see below for quibble). Clearly X precedes B in the modification order of x
, since X happens-before B (synchronization occurs when the thread is started), and if b == 0
then A has read the value stored by X.
(There is possibly a gap here: initialization of an atomic object is not an atomic operation (3.18.1p3), so as worded, the coherence ordering does not apply to it. I have to believe it was intended to apply here, though. Anyway, we could dodge the issue by putting x.store(0, std::memory_order_relaxed);
in main
before starting the threads, which would still address the spirit of your question.)
Now in the definition of the ordering S, apply the second bullet with A = x.load()
and B = x.store(1)
as before, and Y being the atomic_thread_fence
in thread1. Then A is coherence-ordered before B, as we just showed; A is seq_cst
; and B happens-before Y by sequencing. So therefore A = x.load()
precedes Y = fence
in the order S.
Now suppose a = y.load()
in thread1 also yields 0.
By a similar argument to before, y.load()
is coherence ordered before y.store(1)
, and they are both seq_cst
, so y.load()
precedes y.store(1)
in S. Also, y.store(1)
precedes x.load()
in S by sequencing, and likewise atomic_thread_fence
precedes y.load()
in S. We therefore have in S:
x.load
precedes fence
precedes y.load
precedes y.store
precedes x.load
which is a cycle, contradicting the strict ordering of S.
Upvotes: 4