Reputation: 721
According to an article in Jeff Preshing's blog:
A release fence prevents the memory reordering of any read or write which precedes it in program order with any write which follows it in program order.
He also has a great post explaining the differences between release fences and release operations here.
Despite the clear explanations in these blog posts, I'm still confused about how to interpret a release fence call such as
std::atomic_thread_fence(std::memory_order_release);
in terms of memory operation reordering vs the potential fencing mechanisms which provide the guarantees of a release fence.
Is it that the compiler must guarantee that a later write in the thread cannot be moved ahead of the fence when compiling to machine code, and that the CPU must guarantee the same when executing it?
In other words, when exactly does the fence call go from being a statement to being a guarantee?
And most importantly, is there any chance that compiler or CPU reordering could move write operations that follow the fence in program order so that they precede the fence at execution time?
Upvotes: 2
Views: 179
Reputation: 1702
Is it that the compiler must guarantee that a later write in the thread cannot be moved ahead of the fence when compiling to machine code, and that the CPU must guarantee the same when executing it?
Yes. Certain compiler optimizations must be disabled around the fence, and the compiler must also emit code that prevents the corresponding CPU reorderings, including any necessary CPU fence instructions. That combination is what makes it a guarantee.
is there any chance that compiler or CPU reordering could move write operations that follow the fence in program order so that they precede the fence at execution time?
An std::atomic_thread_fence with std::memory_order_release prevents loads and stores before the fence ("before" in program order) from being reordered with any store after the fence. Loads after the fence can still be reordered before it. At execution time an actual memory barrier instruction might not be needed, as long as the guarantee holds; on x86, for example, the hardware ordering already provides it, so the fence only constrains the compiler.
An std::atomic_thread_fence with std::memory_order_acquire prevents loads and stores after the fence ("after" in program order) from being reordered with any load before the fence. Stores before the fence can still be reordered after it. Again, an actual memory barrier instruction might not be needed at execution time, as long as the guarantee holds.
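As an illustration of the two fences working together, here is a minimal sketch of a hand-off between two threads. The names payload, ready, producer, consumer and run_fence_handoff are hypothetical and chosen only for this example; the flag itself is accessed with relaxed operations, so all of the ordering comes from the fences.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names, for illustration only.
int payload = 0;                 // plain, non-atomic data
std::atomic<bool> ready{false};  // flag used to publish payload

void producer() {
    payload = 42;                                         // (1) ordinary write
    std::atomic_thread_fence(std::memory_order_release);  // (2) release fence
    // (3) This store follows the fence, so (1) may not be reordered after it.
    ready.store(true, std::memory_order_relaxed);
}

int consumer() {
    // (4) Relaxed load: by itself it orders nothing.
    while (!ready.load(std::memory_order_relaxed)) {
        std::this_thread::yield();
    }
    std::atomic_thread_fence(std::memory_order_acquire);  // (5) acquire fence
    // (6) This read follows the fence, so it may not be reordered before (4).
    return payload;
}

// Runs one producer and one consumer and returns what the consumer read.
int run_fence_handoff() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;
    std::thread t1(producer);
    std::thread t2([&seen] { seen = consumer(); });
    t1.join();
    t2.join();
    assert(seen == 42);  // the fences make the write to payload visible
    return seen;
}
```

Calling run_fence_handoff() from main should return 42 on every run. If the release fence were removed, the store at (3) could become visible before the write at (1), and the read of payload would be a data race.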
Note that these fences are stricter than an std::atomic<T>::store with std::memory_order_release and an std::atomic<T>::load with std::memory_order_acquire, respectively.
An std::atomic<T>::store with std::memory_order_release prevents loads and stores before it from being reordered with just that particular store. Subsequent loads and stores can be reordered before it. This is a traditional one-way release, theoretically. (In practice, a more heavy-handed synchronization than is strictly needed might be used.)
An std::atomic<T>::load with std::memory_order_acquire prevents loads and stores after it from being reordered with just that particular load. Earlier loads and stores can be reordered after it. This is a traditional one-way acquire, theoretically.
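For comparison, the same kind of hand-off can be sketched with a release store and an acquire load instead of standalone fences. The names message, published, publish, receive and run_store_load_handoff are hypothetical. Here the ordering is attached to the one atomic variable that carries the release/acquire pair, not to every later store or earlier load in the thread.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names, for illustration only.
int message = 0;                     // plain, non-atomic data
std::atomic<bool> published{false};  // carries the release/acquire pair

void publish() {
    message = 7;  // may not be reordered after the release store below
    published.store(true, std::memory_order_release);
    // Stores placed after this line get no ordering from the release store.
}

int receive() {
    // Acquire load: the read of message below may not be reordered before it.
    while (!published.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return message;
}

// Runs one publisher and one receiver and returns what the receiver read.
int run_store_load_handoff() {
    message = 0;
    published.store(false, std::memory_order_relaxed);
    int seen = -1;
    std::thread t1(publish);
    std::thread t2([&seen] { seen = receive(); });
    t1.join();
    t2.join();
    assert(seen == 7);  // the release/acquire pair makes message visible
    return seen;
}
```

Calling run_store_load_handoff() should return 7 on every run. A release fence would be the better fit when one point in the code has to order the preceding writes against several later relaxed stores, since the fence covers all of them rather than a single store.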
Upvotes: 1