C++ std::memory_order_relaxed and skip/stop flag

Question

Is it ok to use std::memory_order_relaxed for skip flag, like in iterate:

    constexpr static const std::size_t capacity = 128; 
    std::atomic aliveness[capacity];
    T data[capacity];     // immutable/atomic/etc.

    template
    void iterate(Closure&& closure){
        for(std::size_t i = 0; i



Or I should use release/acquire, instead?

aliveness[i] may become "alive" again.

iterate and erase called from multiple threads simuteniously. Consider data immutable/atomic/synchronized under some other external lock.

Peter Cordes · Accepted Answer

Assumption: other thread(s) can be running iterate() at any time while you're using erase. (An early version of the question didn't specify immutable. This answer is still relevant if the locking (or lack thereof) for updating data[i] isn't ordered wrt. writes to alive[i].)

If data is truly immutable, then mo_relaxed is definitely fine, unless you need the global visibility of those stores to be ordered with respect to something else the thread is doing. mo_relaxed stores will always eventually become visible to other threads (and on current CPUs, will do so very quickly).

If you're going to modify the non-atomic data[i] while alive[i] is false, you need to make sure other threads aren't using its value while it's being modified. That would be UB in C++, and an actual correctness problem on real hardware depending on T and closure.

Acquire semantics will work for iterate. Access to data[i] logically happens after alive[i], because the one-way barrier is in the correct direction. The acquire-load won't reorder with later loads or stores, only earlier ones.

But the store in erase is a problem. It needs to be globally visible before any modification of data[i]. But release-stores are allowed to reorder with later stores. What you need is a release fence to block reordering of stores in both directions.

void erase(std::size_t index){
    aliveness[index].store(false, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_release);
  // or don't do that, this probably wouldn't be enough synchronization
  // and whatever you do that syncs the data might make this unnecessary.
}

If T was an atomic type, a release-store to data[i] would do the trick. But don't do that; that will suck if T is too large to be lock-free atomic.

(Update: I'm not sure this is fully valid. The ISO C++ memory ordering rules are defined in terms of synchronizes-with / happens-before, and what values a given load is allowed to see. Not in terms of reordering per-se between atomic and non-atomic operations.

Also, this is like what you'd do for a SeqLock, but that also depends on data-race UB for reads. If another thread is going to read data[i], if it checks alive[i] then reads, there's a TOCTOU race: our write to alive could happen after that read, and then our write to data[i] could happen simultaneously with another thread reading. So this is probably insufficient to erase / modify / un-erase an element, even on a strongly-ordered non-weird machine like x86.)

On some implementations, a seq-cst store would also work, but I think only as an implementation detail. It usually results in a store + a full-barrier asm instruction. (e.g. x86 MFENCE). So it works only because compilers implement seq-cst stores as store + thread_fence(seq_cst). And that's not the case on AArch64 where STLR is just a release operation; only the interaction with LDAR gives SC-DRF. (seq_cst for data-race-free programs).

Note that iterate isn't safe if closure modifies data[], unless only one thread can call it at a time. In which case, what's the point of this? So probably you should use

void iterate(Closure&& closure) const
{ ... }

so iterate only works on const objects of your container.

C++ std::memory_order_relaxed and skip/stop flag

Answers (1)

Related Questions