Reputation: 348
Suppose thread 1 is doing atomic stores on a variable v using memory_order_release (or any other order) and thread 2 is doing atomic reads on v using memory_order_relaxed.
It should be impossible to have partial reads in this case. An example of a partial read would be reading the first half of v from the latest value and the second half of v from the old value.
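A minimal sketch of that scenario, assuming a 64-bit v whose two 32-bit halves always hold the same number, so any mismatch would be a partial read. The helper name count_torn_reads and the iteration count are made up for illustration; with std::atomic the count should always be zero.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

// Hypothetical helper: returns how many loads saw mismatched halves.
int count_torn_reads(std::uint32_t iterations) {
    std::atomic<std::uint64_t> v{0};
    std::atomic<bool> done{false};

    std::thread writer([&] {
        for (std::uint32_t i = 1; i <= iterations; ++i) {
            // Both 32-bit halves of the stored value hold the same number.
            std::uint64_t both = (std::uint64_t{i} << 32) | i;
            v.store(both, std::memory_order_release);
        }
        done.store(true, std::memory_order_release);
    });

    int torn = 0;
    while (!done.load(std::memory_order_acquire)) {
        // Relaxed gives no ordering, but the load itself is still atomic.
        std::uint64_t seen = v.load(std::memory_order_relaxed);
        std::uint32_t high = static_cast<std::uint32_t>(seen >> 32);
        std::uint32_t low = static_cast<std::uint32_t>(seen);
        if (high != low) ++torn;
    }
    writer.join();
    return torn;
}
```

Calling count_torn_reads(100000) from main is enough to try it; build with thread support (e.g. -pthread on GCC or Clang).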
v without using atomic operations, can we have partial reads in theory?
Upvotes: 0
Views: 188
Reputation: 364458
For 1. how do you propose doing that? atomic<T> is a class template that overloads the implicit conversion to T (operator T()) to behave like .load(mo_seq_cst). That makes tearing impossible. A seq_cst atomic load is like a relaxed one plus some ordering guarantees.
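To illustrate, here is a small sketch of the ways to read a std::atomic<int>; the function names are made up for the example. The first two are equivalent, and the third drops the ordering guarantees but is still a single atomic load.

```cpp
#include <atomic>

// Implicit conversion: operator int() does the same thing as v.load().
int read_implicit(const std::atomic<int>& v) {
    int x = v;
    return x;
}

// The same load written out, with the default seq_cst order made explicit.
int read_explicit(const std::atomic<int>& v) {
    return v.load(std::memory_order_seq_cst);
}

// Weaker ordering, but still an atomic load that cannot tear.
int read_relaxed(const std::atomic<int>& v) {
    return v.load(std::memory_order_relaxed);
}
```

None of these gives you a way to read the object non-atomically, which is the point: through the std::atomic interface, every access is an atomic operation.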
The template also overloads operators like ++ to do an atomic .fetch_add(1, mo_seq_cst). (Or for pre-increment, 1 + fetch_add, to produce the already-incremented value.)
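A short sketch of that equivalence, with illustrative names (IncrementResults and the two functions are not from any library). Both functions should produce the same three values.

```cpp
#include <atomic>

// Hypothetical result holder for the comparison.
struct IncrementResults {
    int post_value;   // what the post-increment expression returned
    int pre_value;    // what the pre-increment expression returned
    int final_value;  // value of the atomic afterwards
};

IncrementResults increment_with_operators() {
    std::atomic<int> v{5};
    int post = v++;  // atomic RMW; yields the old value 5, v becomes 6
    int pre = ++v;   // atomic RMW; yields the new value 7, v becomes 7
    return {post, pre, v.load()};
}

IncrementResults increment_with_fetch_add() {
    std::atomic<int> v{5};
    // fetch_add returns the value held before the addition.
    int post = v.fetch_add(1, std::memory_order_seq_cst);
    // Adding 1 to that return value gives the already-incremented value.
    int pre = 1 + v.fetch_add(1, std::memory_order_seq_cst);
    return {post, pre, v.load()};
}
```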
Of course, if you look at the bytes of the object representation of atomic<T> by reading it non-atomically through a char* (e.g. with memcpy(&tmp, &v, sizeof(int))), that's a data race and therefore UB if another thread is modifying it. And yes, you can get tearing in practice depending on how you do it.
Tearing is more likely for objects too large to be lock-free, but it's also possible on some implementations for smaller ones, e.g. 8-byte objects on a 32-bit system that can implement 8-byte atomicity with special instructions but normally just uses two 32-bit loads.
e.g. 32-bit x86, where an atomic 8-byte load can be done with SSE and then bouncing the result back to integer regs, or with lock cmpxchg8b. Compilers don't do that when they just want two integer registers.
But many 32-bit RISCs that provide atomic 8-byte loads have a double-register load that produces 2 output registers from one instruction, e.g. ARM ldrd or MIPS ld. Compilers do use these to optimize aligned 8-byte loads even when atomicity isn't the goal, so you'd probably "get lucky" and not see tearing anyway.
Small objects would typically happen to be atomic anyway; see "Why is integer assignment on a naturally aligned variable atomic on x86?"
Of course, the compiler doesn't have to assume that a non-atomic object can change asynchronously, so a loop could keep using a stale value indefinitely. That's unlike a relaxed atomic, which on current compilers is like volatile in that it always re-accesses memory. (Via coherent hardware cache of course, just not keeping the value in a register.)
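As a rough sketch of the difference, here is a spin-wait on a relaxed atomic flag; spin_until_set is a made-up name. Each iteration does a real load, so the loop notices the other thread's store. The non-atomic version is only described in a comment because actually running it would be a data race.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: spins until another thread sets the flag.
bool spin_until_set() {
    std::atomic<bool> ready{false};

    std::thread setter([&] {
        ready.store(true, std::memory_order_relaxed);
    });

    // Every iteration performs an actual atomic load of `ready`.
    // With a plain `bool ready`, this would be a data race (UB), and the
    // compiler would be allowed to load once and then spin on a register.
    while (!ready.load(std::memory_order_relaxed)) {
        std::this_thread::yield();
    }

    setter.join();
    return ready.load(std::memory_order_relaxed);
}
```

Relaxed is enough here because nothing else is being published through the flag; if the loop had to see other data written before the store, you'd want release on the store and acquire on the load.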
Upvotes: 2