c++language-lawyerc++20stdatomicmemory-model

Reputation: 7369

How does seq_cst order formally guarantee the result in an IRIW litmus test?

Consider this example from cppreference

#include <thread>
#include <atomic>
#include <cassert>
 
std::atomic<bool> x = {false};
std::atomic<bool> y = {false};
std::atomic<int> z = {0};
 
void write_x()
{
    x.store(true, std::memory_order_seq_cst);  // #1
}
 
void write_y()
{
    y.store(true, std::memory_order_seq_cst); // #2
}
 
void read_x_then_y()
{
    while (!x.load(std::memory_order_seq_cst))  // #3
        ;
    if (y.load(std::memory_order_seq_cst)) {  // #4
        ++z;
    }
}
 
void read_y_then_x()
{
    while (!y.load(std::memory_order_seq_cst))  // #5
        ;
    if (x.load(std::memory_order_seq_cst)) {  // #6
        ++z;
    }
}
 
int main()
{
    std::thread a(write_x);
    std::thread b(write_y);
    std::thread c(read_x_then_y);
    std::thread d(read_y_then_x);
    a.join(); b.join(); c.join(); d.join();
    assert(z.load() != 0);  // will never happen
}

In order to prove the assert will never happen, we should prove either y-store happens before y-load-in-if, or x-store happens before x-load-in-if, or both. They will guarantee the value of z should be at least greater than 0, which makes the assert never happen, as per the below rules.

The value of an atomic object M, as determined by evaluation B, shall be the value stored by some side effect A that modifies M, where B does not happen before A.

If an operation A that modifies an atomic object M happens before an operation B that modifies M, then A shall be earlier than B in the modification order of M.

If a value computation A of an atomic object M happens before a value computation B of M, and A takes its value from a side effect X on M, then the value computed by B shall either be the value stored by X or the value stored by a side effect Y on M, where Y follows X in the modification order of M.

If a value computation A of an atomic object M happens before an operation B that modifies M, then A shall take its value from a side effect X on M, where X precedes B in the modification order of M.

If a side effect X on an atomic object M happens before a value computation B of M, then the evaluation B shall take its value from X or from a side effect Y that follows X in the modification order of M.

However, we can only have these conditions

1. x-store synchronizes x-load-in-while
2. y-store synchronizes y-load-in-while
3. x-load-while is sequenced before y-load-in-if
4. y-load-while is sequenced before x-load-in-if

According to [intro.races] p10, bullet 1 and 3 give:

x-store inter-thread happens before y-load-in-if

bullet 2 and 4 give:

y-store inter-thread happens before x-load-in-if

Then, I cannot find any association that can prove y-store happens before y-load-in-if, or x-store happens before x-load-in-if. The special for this case is they are all seq operations, so there is also an extra rule, called single total order S. Even with that rule, I can only conclude the following things:

1. x-store < x-load-in-while < y-load-in-if, in S
2. y-store < y-load-in-while < x-load-in-if, in S

The symbol < means precedes. Apart from these conclusions, I cannot get more information that can interpret why the assertion will never happen. How to correctly use the current C++ standard to interpret this case?

Upvotes: 1

Answers (3)

duck

Reputation: 2448

Both ++z (at #4 and #6) happen before z.load() in main ([thread.thread.member]/4, [intro.races]/9, /10.2), if they happen at all.

These side effects are necessarily visible to the load ([intro.races]/18), so the only way for the assert to fire is for both #4 and #6 to read false.

In read_x_then_y, #3 reads true written by #1, meaning #1 is coherence-ordered before #3 ([atomics.order]/3.1), and #3 strongly happens before the load of y at #4 ([intro.races]/12.1).

#4 reading false would make it coherence-ordered before the write of true at #1 ([atomics.order]/3.3, where X is the initial 'write' of false).

Few points of clarification here:

The first part of [atomics.order]/3.3 simply states that A and B must be different atomic operations, it does not limit the scope of this bullet to just read-modify-write ones. P0668, the paper that introduced the wording, summarizes:

The third bullet corresponds to what's often called "reads before": A reads a write earlier than B.
Initialization of an object not technically being a side effect is the subject of CWG2587, and is not unique to atomics. I think it is clear enough that the intent here is that these rules include initialization when talking about side effects.
(Constant) initialization of y happening before #2 is another point that the standard (by my reading, at least) fails to explicitly specify, but we can once again apply common sense: if it doesn't happen before #2, this program has undefined behavior because it accesses an out-of-lifetime object ([res.on.objects]/2), and any further analysis is meaningless. That would of course be absurd, so we'll assume that #2 does in fact happen after y is initialized, and as such it follows the initialization in the modification order of y ([intro.races]/15).

This analysis gives us the ordering #1 < #3 < #4 < #2 in the total order S of seq_cst operations ([atomics.order]/4) on an execution where #4 in read_x_then_y reads false.

Repeating the same for read_y_then_x (once again assuming #6 gets false) yields #2 < #5 < #6 < #1.

In the first case, #1 precedes #2 in S, in the second case, #2 precedes #1. Clearly, there's no possible S that is consistent with both of these, meaning #4 and #6 can't both read false on any possible execution of this program. Quoting P0668 again:

If these constraints are not satisfiable by any total order S, then the candidate execution which gave rise to the coherence order is not valid.

Upvotes: 5

вася

Reputation: 11

According to this:

³ An atomic operation A on some atomic object M is coherence-ordered before another atomic operation B on M if
^(3.1) — A is a modification, and B reads the value stored by A, or
^(3.2) — A precedes B in the modification order of M, or
^(3.3) — A and B are not the same atomic read-modify-write operation, and there exists an atomic modification X of M such that A reads the value stored by X and X precedes B in the modification order of M, or
^(3.4) — there exists an atomic modification X of M such that A is coherence-ordered before X and X is coherence-ordered before B.

we need the following coherence-ordered before relations if we want x-load-in-if and y-load-in-if to return false:

for x: x-load-in-if -> x-store -> x-load-in-while
for y: y-load-in-if -> y-store -> y-load-in-while

According to this:

¹² An evaluation A strongly happens before an evaluation D if, either
^(12.1) — A is sequenced before D, or
^(12.2) — A synchronizes with D, and both A and D are sequentially consistent atomic operations ([atomics.order]), or
^(12.3) — there are evaluations B and C such that A is sequenced before B, B simply happens before C, and C is sequenced before D, or
^(12.4) — there is an evaluation B such that A strongly happens before B, and B strongly happens before D.
[Note 11: Informally, if A strongly happens before B, then A appears to be evaluated before B in all contexts. Strongly happens before excludes consume operations. — end note]

these are the strongly happens before relations for our example:

x-store -> x-load-in-while -> y-load-in-if
y-store -> y-load-in-while -> x-load-in-if

According to this:

⁴ There is a single total order S on all memory_order::seq_cst operations, including fences, that satisfies the following constraints. First, if A and B are memory_order::seq_cst operations and A strongly happens before B, then A precedes B in S. Second, for every pair of atomic operations A and B on an object M, where A is coherence-ordered before B, the following four conditions are required to be satisfied by S:
^(4.1) — if A and B are both memory_order::seq_cst operations, then A precedes B in S; and
...

we need to put all the operations in a single total order S, while preserving both:

coherence-ordered before relations:
- x-load-in-if -> x-store -> x-load-in-while
- y-load-in-if -> y-store -> y-load-in-while
and strongly happens before relations:
- x-store -> x-load-in-while -> y-load-in-if
- y-store -> y-load-in-while -> x-load-in-if

Let's first look at strongly happens before relations:

no matter how exactly you put the operations in a single total order S, either y-load-in-if or x-load-in-if will be the last operation
if x-load-in-if will be the last operation, then x-store -> x-load-in-if in S, but according to coherence-ordered before it should be x-load-in-if -> x-store in S => contradiction.
The same logic works when y-load-in-if is the last operation in S.

As a result, executions where both x-load-in-if and y-load-in-if return false (and z=0) aren't allowed in C++.

Upvotes: 1

Peter Cordes

Reputation: 365557

Consider each of the two possibilities, x-store < y-store in S, or y-store < x-store in S. One or the other of those must be true, because both stores are part of the total order of all SC operations.

For each case, one of the read functions will provably increment z due to the order of the loads in S. (And the other function might also happen to increment; the second load could happen after both stores have become visible.)

Syncs-with is irrelevant, there are no earlier operations in write_x or write_y for us to see.

Informally, the existence of a single total order of all atomic operations (even on different objects) is what guarantees that all threads agree on the order of two independent stores; this is an IRIW litmus test. (Independent Readers, Independent Writers).

A C++ program that's data-race free and uses seq_cst for all its atomic operations behaves in a way that's explainable as some interleaving of source order. i.e. sequentially consistent execution of data-race free programs, SC-DRF.

Fun fact: in CPU memory models, only the weakest models actually allow IRIW "reordering" and real hardware that actually does it (regardless of what's allowed on paper) is even rarer. e.g. ARMv8 disallows it but otherwise is quite weak; ARMv7 allowed it but no ARM hardware ever did it. Having all cores agree on a store order is vastly weaker than sequential consistency, and happens automatically if store data only becomes visible to other cores via coherent cache.

But in C++, only seq_cst forbids it; if you'd used acq_rel (or release for the stores and acquire for the loads), z == 0 would be allowed.

Upvotes: 1

How does seq_cst order formally guarantee the result in an IRIW litmus test?

Answers (3)

Related Questions