Reputation: 2722
I'm reading through the documentation, and more specifically these two entries:
memory_order_acquire: A load operation with this memory order performs the acquire operation on the affected memory location: no reads or writes in the current thread can be reordered before this load. All writes in other threads that release the same atomic variable are visible in the current thread (see Release-Acquire ordering below).
memory_order_release: A store operation with this memory order performs the release operation: no reads or writes in the current thread can be reordered after this store. All writes in the current thread are visible in other threads that acquire the same atomic variable (see Release-Acquire ordering below) and writes that carry a dependency into the atomic variable become visible in other threads that consume the same atomic (see Release-Consume ordering below)
These two bits:
from memory_order_acquire
... no reads or writes in the current thread can be re-ordered before this load...
from memory_order_release
... no reads or writes in the current thread can be re-ordered after this store...
What exactly do they mean?
There's also this example
#include <thread>
#include <atomic>
#include <cassert>
#include <string>

std::atomic<std::string*> ptr;
int data;

void producer()
{
    std::string* p = new std::string("Hello");
    data = 42;
    ptr.store(p, std::memory_order_release);
}

void consumer()
{
    std::string* p2;
    while (!(p2 = ptr.load(std::memory_order_acquire)))
        ;
    assert(*p2 == "Hello"); // never fires
    assert(data == 42); // never fires
}

int main()
{
    std::thread t1(producer);
    std::thread t2(consumer);
    t1.join(); t2.join();
}
But I cannot really figure out where the two bits I've quoted apply. I understand what's happening, but I don't really see the re-ordering bit because the code is so small.
Upvotes: 6
Views: 8253
Reputation: 8579
Acquire and release act as memory barriers attached to an atomic operation.
If your program reads data after an acquire load, and that load reads the value written by a release store in another thread, you are assured of seeing every write that thread made before its release store. Every atomic variable has a single modification order that all threads agree on (this holds even for the weaker memory_order_relaxed); memory_order_acquire and memory_order_release in effect propagate that order to the ordinary reads and writes made by the threads using that atomic variable.
You can use an atomic to indicate that something has 'finished' or is 'ready', but without these barriers a consumer that goes on to read anything other than the atomic variable itself couldn't rely on 'seeing' the right 'versions' of that other memory, and atomics would have limited value.
The statements about 'moving before' or 'moving after' are constraints on the optimizer (and on the CPU, which can also reorder memory accesses): operations must not be moved across the barrier in the forbidden direction. Optimizers are very good at re-ordering instructions and even omitting redundant reads/writes, but if they re-organise the code across the memory barriers they may unwittingly violate that order.
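To make the 'before' and 'after' wording concrete, here is a minimal sketch of the same hand-off pattern with the constrained operations labelled. The names payload, ready and run_handoff are made up for this illustration and do not appear in the question.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: hands one int from a producer thread to a consumer
// thread and returns the value the consumer observed.
int run_handoff()
{
    int payload = 0;                 // ordinary, non-atomic data
    std::atomic<bool> ready{false};  // the flag both threads synchronize on
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                                  // (A) may not be moved below (B)
        ready.store(true, std::memory_order_release);  // (B) release store
        // Anything written here could still be moved above (B):
        // release only stops earlier operations from sinking below it.
    });

    std::thread consumer([&] {
        // Anything written here could still be moved below (C):
        // acquire only stops later operations from rising above it.
        while (!ready.load(std::memory_order_acquire))  // (C) acquire load
            ;                                           // spin until (B) is observed
        seen = payload;                                 // (D) may not be moved above (C)
    });

    producer.join();
    consumer.join();
    return seen;
}
```

Because (A) cannot sink below (B) and (D) cannot rise above (C), once the loop at (C) has read true the read at (D) is guaranteed to see 42. Each barrier is one-way: it restricts movement in only one direction.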
Your code relies on the std::string object (a) having been constructed in producer() before ptr is assigned and (b) the constructed version of that string (i.e. the version of the memory it occupies) being the one that consumer() reads.
Put simply, consumer() is going to eagerly read the string as soon as it sees ptr assigned, so it damn well better see a valid and fully constructed object or bad times will ensue.
In that code, 'the act' of assigning ptr is how producer() 'tells' consumer() the string is 'ready'. The memory barrier exists to make sure that's what the consumer sees.
Conversely, if ptr were declared as an ordinary std::string *, then the compiler could decide to optimize p away, assign the allocated address directly to ptr, and only then construct the object and assign the int data. That is likely a disaster for the consumer thread, which is using that assignment as the indicator that the objects producer is preparing are ready.
To be accurate, if ptr were an ordinary pointer the consumer might never see the value assigned, or on some architectures might read a partially assigned value where only some of the bytes have been written, so that it points to a garbage memory location. However, those aspects are about it being atomic, not about the wider memory barriers.
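The 'atomic' part on its own can be sketched separately from the barriers. In the hypothetical helper below (only_whole_values_seen, and the constants A and B, are invented for this illustration), the value under test is only ever accessed with memory_order_relaxed, yet a reader can never observe a half-written mix of the two bit patterns.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical helper: returns true if the reader only ever observed one of
// the two complete values the writer stored, i.e. no torn reads occurred.
bool only_whole_values_seen(int iterations)
{
    constexpr std::uint64_t A = 0x0000000000000000ULL;
    constexpr std::uint64_t B = 0xFFFFFFFFFFFFFFFFULL;

    std::atomic<std::uint64_t> value{A};
    std::atomic<bool> stop{false};
    bool ok = true;  // written only by the reader, read after join()

    std::thread writer([&] {
        for (int i = 0; i < iterations; ++i)
            value.store((i % 2) ? A : B, std::memory_order_relaxed);
        stop.store(true, std::memory_order_release);  // tell the reader to finish
    });

    std::thread reader([&] {
        while (!stop.load(std::memory_order_acquire)) {
            std::uint64_t v = value.load(std::memory_order_relaxed);
            if (v != A && v != B)
                ok = false;  // would indicate a partially written value
        }
    });

    writer.join();
    reader.join();
    return ok;
}
```

Relaxed ordering is enough here because the check concerns only the atomic variable itself; nothing else is being published through it.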
Upvotes: 6
Reputation: 85266
The work done by a thread is not guaranteed to be visible to other threads.
To make data visible between threads, a synchronization mechanism is needed. A non-relaxed atomic or a mutex can be used for that. This is called acquire-release semantics. Unlocking a mutex "releases" all memory writes made before it, and subsequently locking the same mutex "acquires" those writes.
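As a rough sketch of the mutex variant, assuming the hypothetical names run_with_mutex, ready and seen (none of which come from the question), the same hand-off can be written with std::mutex doing the releasing and acquiring:

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical helper: same hand-off as in the question, but synchronized
// with a mutex instead of an atomic pointer.
int run_with_mutex()
{
    std::mutex m;
    int data = 0;
    bool ready = false;  // plain bool; every access is protected by m
    int seen = -1;

    std::thread producer([&] {
        std::lock_guard<std::mutex> lock(m);
        data = 42;
        ready = true;
    });  // leaving the scope unlocks m: the "release"

    std::thread consumer([&] {
        for (;;) {
            std::lock_guard<std::mutex> lock(m);  // locking m: the "acquire"
            if (ready) {
                seen = data;  // sees 42, written before the unlock
                break;
            }
        }
    });

    producer.join();
    consumer.join();
    return seen;
}
```

Polling under the lock keeps the sketch short; real code would more likely wait on a std::condition_variable.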
In the question's code, ptr is used to "release" the work done so far (data = 42) to another thread:
data = 42;
ptr.store(p, std::memory_order_release); // changes ptr from null to not-null
And here we wait for that, and by doing that we synchronize ("acquire") the work done by the producer thread:
while (!ptr.load(std::memory_order_acquire)) // assuming initially ptr is null
    ;
assert(data == 42);
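Note two distinct actions: (1) the consumer observes the new value of ptr itself, and (2) as a side effect of the release/acquire pair, the work the producer did before the store becomes visible to the consumer.
In the absence of (2), e.g. when using memory_order_relaxed, only the atomic value itself is synchronized. All other work done before/after isn't, e.g. data won't necessarily contain 42 and there may not be a fully constructed string instance at the address p (as seen by the consumer).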
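For contrast, here is a small sketch of a case where memory_order_relaxed is sufficient because only the atomic value itself matters. The function count_relaxed and its parameters are hypothetical names chosen for this example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: several threads bump one shared counter.
// No other data is published through the counter, so relaxed ordering is enough.
long count_relaxed(int threads, int per_thread)
{
    std::atomic<long> hits{0};
    std::vector<std::thread> pool;

    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&hits, per_thread] {
            for (int i = 0; i < per_thread; ++i)
                hits.fetch_add(1, std::memory_order_relaxed);  // atomic, but unordered
        });
    }

    for (auto& th : pool)
        th.join();  // join() itself synchronizes with the end of each thread

    return hits.load(std::memory_order_relaxed);
}
```

No increment is lost because each fetch_add is atomic. What relaxed does not provide is any guarantee about other memory, which is exactly what the question's example needs for data and the string.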
For more details about acquire/release semantics and other aspects of the C++ memory model, I would recommend watching Herb Sutter's excellent "atomic<> Weapons" talk; it's long but fun to watch. And for even more details there's Anthony Williams' book "C++ Concurrency in Action".
Upvotes: 9
Reputation: 62531
If you used std::memory_order_relaxed for the store, the compiler could use the "as-if" rule to move data = 42; to after the store, and consumer could see a non-null pointer and indeterminate data.
If you used std::memory_order_relaxed for the load, the compiler could use the "as-if" rule to move the assert(data == 42); to before the load loop.
Both of these are allowed because, without acquire/release ordering, the value of data is not related to the value of ptr.
If instead ptr were non-atomic, you'd have a data race and therefore undefined behaviour.
Upvotes: 3