Reputation: 27375
I'm trying to understand the instruction reordering by the following simple example:
int a;
int b;

void foo(){
    a = 1;
    b = 1;
}

void bar(){
    while(b == 0) continue;
    assert(a == 1);
}
It's known that in this example the assertion may fail if one thread executes foo and another one executes bar. But I don't understand why. I consulted the Intel manual, Vol. 3A, Section 8.2.2, and found the following:
Writes to memory are not reordered with other writes, with the following exceptions:
— streaming stores (writes) executed with the non-temporal move instructions (MOVNTI, MOVNTQ, MOVNTDQ, MOVNTPS, and MOVNTPD); and
— string operations (see Section 8.2.4.1).
There are no string operations here, and I did not notice any NT move instructions either. So why is reordering of the writes possible?
Or does the word "memory" matter in "Writes to memory are not reordered"? That is, when a and b are cached and the writes go to cache rather than to main memory, can they be reordered?
Upvotes: 3
Views: 561
Reputation: 363999
Your premise is wrong. Only compile-time reordering can break this example on x86 (see footnote 1).
x86 asm stores are release-stores. They can only commit from the store buffer to L1d cache in program order.
a can't still be in shared state after b=1 is visible; that would mean the thread running foo let its stores commit out of order. This is what "Writes to memory are not reordered with other writes" means, for stores to cacheable memory.
If it's in shared state again after being invalidated by the RFO from the thread running foo, then it will have the updated value of a.
Footnote 1. Of course the spin-loop will optimize into if (b==0) infinite_loop, because data-race UB lets the compiler hoist the load. See "MCU programming - C++ O2 optimization breaks while loop".
You seem to be asking about C rules while assuming that the code will be translated naively / directly to x86 asm. You could get that with relaxed atomics, but not volatile, because volatile accesses can't be reordered (at compile time) with other volatile accesses.
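A minimal sketch of the relaxed-atomics variant, assuming C11 <stdatomic.h> is available; the names foo_relaxed and bar_relaxed are made up for illustration and are not from the question. Making a and b atomic removes the data-race UB, but memory_order_relaxed leaves the two stores unordered as far as the C rules are concerned, so the compiler is still allowed to reorder them even though x86 hardware would not.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical atomic versions of the question's globals;
   objects with static storage start out as 0. */
atomic_int a;
atomic_int b;

/* Writer: two relaxed stores. The C memory model imposes no ordering
   between them, so a compiler may emit them in either order. */
void foo_relaxed(void) {
    atomic_store_explicit(&a, 1, memory_order_relaxed);
    atomic_store_explicit(&b, 1, memory_order_relaxed);
}

/* Reader: spins on an atomic load, so there is no data race for the
   compiler to exploit when optimizing the loop. */
void bar_relaxed(void) {
    while (atomic_load_explicit(&b, memory_order_relaxed) == 0)
        continue;
    /* Allowed to fail by the C rules when foo_relaxed runs in another
       thread: seeing b == 1 does not imply that a == 1 is visible. */
    assert(atomic_load_explicit(&a, memory_order_relaxed) == 1);
}
```

Run foo_relaxed and bar_relaxed in two different threads to get the scenario from the question. Called one after the other in a single thread, the assertion always holds.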
Upvotes: 3
Reputation: 234635
If one thread were running foo and another were running bar, then the behaviour of your program would be undefined.
You are not allowed to make simultaneous reads and writes on a non-atomic variable such as an int.
So instruction reordering is permissible in this instance.
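One way to make the behaviour defined is sketched below. It assumes C11 atomics are acceptable (the question does not say which standard is in use), and the names foo_sync and bar_sync are hypothetical. A release store to b paired with an acquire load of b gives the reader a happens-before edge to the earlier write of a, so the assertion cannot fail.

```c
#include <assert.h>
#include <stdatomic.h>

int a;         /* plain data, published through b */
atomic_int b;  /* flag; static storage, so it starts as 0 */

/* Writer: the release store keeps the earlier write to a from being
   reordered after the store to b. */
void foo_sync(void) {
    a = 1;
    atomic_store_explicit(&b, 1, memory_order_release);
}

/* Reader: once the acquire load observes b == 1, everything written
   before the release store (including a = 1) is visible. */
void bar_sync(void) {
    while (atomic_load_explicit(&b, memory_order_acquire) == 0)
        continue;
    assert(a == 1);  /* cannot fail: no data race on a any more */
}
```

Only the flag needs to be atomic here; a stays a plain int because every read of it is ordered after the write by the release/acquire pair.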
Upvotes: 4