Why is memory_order_relaxed not okay for a once_flag-like atomic_bool?

Question

On slide 54 of Herb Sutter's talk on "atomic<> Weapons" (which unfortunately he didn't have time to present in the video), he suggests the following code is incorrect:

struct widget {
  static inline atomic instance = nullptr;
  static inline atomic create = false;

  static widget *get_instance()
  {
    if (instance.load(memory_order_acquire) == nullptr) {
      if (!create.exchange(true, memory_order_relaxed)) // <- BAD?
        instance.store(new widget(), memory_order_release);
      else
        while (instance.load(memory_order_acquire) == nullptr)
          ;
    }
    return instance.load(memory_order_acquire);
  }
};

Supposedly the line labeled // <- BAD? should be:

      if (!create.exchange(true, memory_order_acquire)) // <- GOOD?

My question: what can go wrong with the BAD line (even on contrived pathological architectures that do not exist)? Equivalently, what's wrong with the following argument for why the BAD line is actually okay?

If one or more threads call get_instance(), then exactly one thread will see false from create.exchange(...), so exactly one widget will be allocated and exactly one thread will ever store to instance.
No data races occur from access to the variables instance and create, since these are atomic regardless of memory order.
Because of write-write coherence, it is impossible for get_instance() to return nullptr or anything other than the result of the single execution of new widget().
Because the only non-nullptr store to instance uses release semantics, and all loads from instance use acquire semantics, all non-nullptr loads will synchronize with the store, meaning anything done in the widget constructor will happen before any accesses to the instance after get_instance() returns, so also no undefined behavior from accessing non-atomic contents of widget.

Why is memory_order_relaxed not okay for a once_flag-like atomic_bool?

Answers (0)

Related Questions