woder

Reputation: 805

Why does a C++ singleton need memory_order_acquire?

Singleton* Singleton::getInstance() {
     Singleton* tmp = m_instance.load(std::memory_order_relaxed);
     std::atomic_thread_fence(std::memory_order_acquire);        //<--1
     if (tmp == nullptr) {
         std::lock_guard<std::mutex> lock(m_mutex);
         tmp = m_instance.load(std::memory_order_relaxed);
         if (tmp == nullptr) {
             tmp = new Singleton;
             assert(tmp != nullptr);    
             std::atomic_thread_fence(std::memory_order_release); //<--2
             m_instance.store(tmp, std::memory_order_relaxed);
         }
     }
     return tmp;
 }

Here is a common C++ singleton implementation. The release fence at 2 (marked above) is easy to understand: it prevents the construction in new Singleton from being reordered after the store to m_instance. Without this fence, another thread might obtain a pointer to an instance whose construction has not yet completed.

What confuses me is the acquire fence at 1. The release fence guarantees that the Singleton construction has completed before the pointer is stored to m_instance, so when we fetch the instance we should never see one that has not been constructed. Why do we still need an acquire fence at 1?

Also, can we replace atomic_thread_fence with a memory order on the m_instance operations themselves? Are the two versions the same? (See below.)

Singleton* Singleton::getInstance() {
     Singleton* tmp = m_instance.load(std::memory_order_acquire);
     if (tmp == nullptr) {
         std::lock_guard<std::mutex> lock(m_mutex);
         tmp = m_instance.load(std::memory_order_relaxed);
         if (tmp == nullptr) {
             tmp = new Singleton;
             assert(tmp != nullptr);    
             m_instance.store(tmp, std::memory_order_release);
         }
     }
     return tmp;
 }

Upvotes: 1

Views: 378

Answers (1)

mpoeter

Reputation: 3001

Yes, the second version, which puts acquire/release ordering on the m_instance operations themselves, is correct and gives the same guarantee as the first version. In fact, it is preferable to the first version: an acquire fence affects all preceding atomic loads and a release fence affects all subsequent atomic stores, but you only need synchronization on the operations on m_instance. That is why explicit fences can be slower on some architectures.
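To illustrate the two styles side by side outside the singleton, here is a minimal sketch that publishes a plain int through an atomic flag, once with standalone fences and once with acquire/release on the flag itself. The names Channel, transfer_with_fences and transfer_with_orderings are made up for this example and do not come from the question.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper type for this sketch.
struct Channel {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};  // flag that publishes the payload
};

// Variant A: relaxed operations on the flag plus standalone fences.
int transfer_with_fences(int value) {
    Channel ch;
    int received = 0;
    std::thread producer([&] {
        ch.payload = value;
        std::atomic_thread_fence(std::memory_order_release);
        ch.ready.store(true, std::memory_order_relaxed);
    });
    std::thread consumer([&] {
        while (!ch.ready.load(std::memory_order_relaxed)) {
            std::this_thread::yield();
        }
        std::atomic_thread_fence(std::memory_order_acquire);
        received = ch.payload;  // ordered after the producer's write
    });
    producer.join();
    consumer.join();
    return received;
}

// Variant B: the ordering is attached to the flag operations themselves.
int transfer_with_orderings(int value) {
    Channel ch;
    int received = 0;
    std::thread producer([&] {
        ch.payload = value;
        ch.ready.store(true, std::memory_order_release);
    });
    std::thread consumer([&] {
        while (!ch.ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        received = ch.payload;  // ordered after the producer's write
    });
    producer.join();
    consumer.join();
    return received;
}
```

In both variants the consumer's read of payload is ordered after the producer's write, so each function returns the value it was given. Variant B constrains only the operations on ready, which is all that is needed here.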


Why do you need acquire/release in the first place? Because you need a happens-before relation between creation of the singleton and usage of the singleton to avoid data races. Suppose the following:

  • Thread 1 calls Singleton::getInstance() and initializes it; this involves creating a new object and storing the pointer in m_instance
  • Thread 2 calls Singleton::getInstance() and observes the pointer written by Thread 1

Thread 2 will most likely dereference the pointer to access the object it points to, and this access is most likely non-atomic. So you have two non-atomic accesses to the object: one during creation and one when using the object. If these are not ordered by a happens-before relation, then this is a data race.

So how do we establish a happens-before relation? By storing the pointer with memory_order_release and reading it with memory_order_acquire. When an acquire-load observes the value written by a release-store, the store synchronizes-with the load, thereby establishing a happens-before relation. Further, construction of the object is sequenced-before the store, and the load is sequenced-before the dereference (sequenced-before also implies happens-before). Since happens-before is transitive, it follows that construction happens-before the dereference.
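As a rough, self-contained sketch of that chain, here is the acquire/release version from the question filled out so it compiles on its own. The m_value member, the value() accessor and the all_threads_see_constructed_instance helper are additions for illustration only, and the instance is deliberately never deleted.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

class Singleton {
public:
    static Singleton* getInstance() {
        // Acquire-load: if it observes the pointer written by the
        // release-store below, the construction happens-before our use.
        Singleton* tmp = m_instance.load(std::memory_order_acquire);
        if (tmp == nullptr) {
            std::lock_guard<std::mutex> lock(m_mutex);
            // Relaxed suffices here: the mutex already orders this load
            // after a store made by a previous lock holder.
            tmp = m_instance.load(std::memory_order_relaxed);
            if (tmp == nullptr) {
                tmp = new Singleton;  // construction is sequenced-before the store
                m_instance.store(tmp, std::memory_order_release);
            }
        }
        return tmp;
    }

    int value() const { return m_value; }  // non-atomic read of the object

private:
    Singleton() : m_value(42) {}  // non-atomic write during construction

    int m_value;  // illustrative member, not in the original question
    static std::atomic<Singleton*> m_instance;
    static std::mutex m_mutex;
};

std::atomic<Singleton*> Singleton::m_instance{nullptr};
std::mutex Singleton::m_mutex;

// Hypothetical helper: every thread fetches the instance and dereferences it.
bool all_threads_see_constructed_instance(int thread_count) {
    std::vector<Singleton*> seen(thread_count, nullptr);
    std::vector<int> values(thread_count, 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i) {
        threads.emplace_back([&seen, &values, i] {
            Singleton* s = Singleton::getInstance();
            seen[i] = s;
            values[i] = s->value();  // the dereference discussed above
        });
    }
    for (auto& t : threads) t.join();
    for (int i = 0; i < thread_count; ++i) {
        if (seen[i] != seen[0] || values[i] != 42) return false;
    }
    return true;
}
```

Calling all_threads_see_constructed_instance(8) from main should return true. Note that a passing run does not prove the ordering is right, since a data race may go unnoticed on a given machine; a tool such as ThreadSanitizer is a better check if you weaken the orderings to experiment.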

For more details on the C++ memory model I recommend this paper which I have co-authored: Memory Models for C/C++ Programmers

Upvotes: 4
