Reputation: 119
While reading the Go source code, I came across a question about the code in src/sync/once.go:
func (o *Once) Do(f func()) {
	// Note: Here is an incorrect implementation of Do:
	//
	//	if atomic.CompareAndSwapUint32(&o.done, 0, 1) {
	//		f()
	//	}
	//
	// Do guarantees that when it returns, f has finished.
	// This implementation would not implement that guarantee:
	// given two simultaneous calls, the winner of the cas would
	// call f, and the second would return immediately, without
	// waiting for the first's call to f to complete.
	// This is why the slow path falls back to a mutex, and why
	// the atomic.StoreUint32 must be delayed until after f returns.
	if atomic.LoadUint32(&o.done) == 0 {
		// Outlined slow-path to allow inlining of the fast-path.
		o.doSlow(f)
	}
}

func (o *Once) doSlow(f func()) {
	o.m.Lock()
	defer o.m.Unlock()
	if o.done == 0 {
		defer atomic.StoreUint32(&o.done, 1)
		f()
	}
}
Why is atomic.StoreUint32 used, rather than, say, o.done = 1? Are these not equivalent? What are the differences?
Must we use the atomic operation (atomic.StoreUint32) to make sure that other goroutines can observe the effect of f() before o.done is set to 1 on a machine with a weak memory model?
Upvotes: 10
Views: 1010
Reputation: 630
Must we use the atomic operation (atomic.StoreUint32) to make sure that other goroutines can observe the effect of f() before o.done is set to 1 on a machine with a weak memory model?
Yes, you are thinking in the right direction. Note, however, that even if the target machine has a strong memory model, the Go compiler can and will reorder instructions as long as the result adheres to the Go memory model. Conversely, if the machine's memory model is weaker than the language's, the compiler has to emit additional barriers so that the final code complies with the language specification.
Let's consider an implementation of sync.Once without sync/atomic, modified slightly to make it easier to explain:
func (o *Once) Do(f func()) {
	if o.done == 0 { // (1)
		o.m.Lock()         // (2)
		defer o.m.Unlock() // (3)
		if o.done == 0 { // (4)
			f()        // (5)
			o.done = 1 // (6)
		}
	}
}
If a goroutine observes that o.done != 0, it returns immediately. As a result, the function must ensure that f() happens before any read can observe a 1 from o.done.
f and set o.done to 1.
As a result, the write (6) must have release semantics and the read (1) must have acquire semantics. Since Go does not offer separate acquire-load and release-store operations, we must resort to the stronger ordering, sequential consistency, which is provided by atomic.(Load/Store)Uint32.
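The release/acquire pairing described above can be sketched as a small message-passing example. This is only an illustration: the mailbox type and its publish and consume methods are hypothetical names, with ready standing in for o.done and payload standing in for the side effects of f().

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// mailbox is a hypothetical type: payload is ordinary data (like the
// effects of f), ready plays the role of o.done.
type mailbox struct {
	payload int
	ready   uint32
}

// publish writes the payload with a plain store, then sets the flag with
// an atomic store, mirroring write (6).
func (m *mailbox) publish(v int) {
	m.payload = v
	atomic.StoreUint32(&m.ready, 1)
}

// consume spins on an atomic load, mirroring read (1), and only then
// reads the payload with a plain load.
func (m *mailbox) consume() int {
	for atomic.LoadUint32(&m.ready) == 0 {
		runtime.Gosched() // let the publisher run
	}
	return m.payload
}

func main() {
	m := &mailbox{}
	go m.publish(42)
	fmt.Println(m.consume()) // prints 42
}
```

Because the atomic load that observes 1 is synchronized after the atomic store, the plain read of payload is ordered after the plain write. Replacing either atomic call with a plain access would remove that ordering and make the program racy.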
Final note: since accesses to memory locations no larger than a machine word are guaranteed to be atomic, the use of atomic here has nothing to do with atomicity and everything to do with synchronisation.
Upvotes: 1
Reputation: 109417
Remember, unless you are writing the assembly by hand, you are not programming to your machine's memory model; you are programming to Go's memory model. This means that even if primitive assignments are atomic on your architecture, Go requires the use of the atomic package to ensure correctness across all supported architectures.
Access to the done flag outside of the mutex only needs to be safe, not strictly ordered, so atomic operations can be used instead of always obtaining a lock with a mutex. This is an optimization to make the fast path as efficient as possible, allowing sync.Once to be used in hot paths.
The mutex used for doSlow is for mutual exclusion within that function alone, to ensure that only one caller ever makes it to f() before the done flag is set. The flag is written using atomic.StoreUint32 because the write may happen concurrently with atomic.LoadUint32 outside of the critical section protected by the mutex.
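The mutex also makes later callers wait until f has returned, which the CompareAndSwap variant from the source comment quoted in the question does not do. A minimal sketch of that failure, assuming a hypothetical casOnce type that copies the incorrect implementation (the helper returnsBeforeFDone is also made up for this demonstration):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// casOnce is a hypothetical type reproducing the incorrect implementation
// described in the comment inside sync.Once.Do.
type casOnce struct {
	done uint32
}

func (o *casOnce) Do(f func()) {
	if atomic.CompareAndSwapUint32(&o.done, 0, 1) {
		f()
	}
}

// returnsBeforeFDone reports whether a second call to Do returned while
// the first caller's f was still running.
func returnsBeforeFDone() bool {
	var o casOnce
	var finished uint32
	started := make(chan struct{})
	release := make(chan struct{})
	exited := make(chan struct{})

	go func() {
		defer close(exited)
		o.Do(func() {
			close(started) // f is now running
			<-release      // keep f blocked until the check below is done
			atomic.StoreUint32(&finished, 1)
		})
	}()

	<-started
	o.Do(func() {}) // loses the CAS and returns immediately
	early := atomic.LoadUint32(&finished) == 0

	close(release)
	<-exited
	return early
}

func main() {
	fmt.Println(returnsBeforeFDone()) // prints true
}
```

With sync.Once in place of casOnce, the second Do call would block on the mutex until the first f returned, so this particular test harness would never get past it.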
Reading the done field with a plain read concurrently with writes, even atomic writes, is a data race. Likewise, just because the field is read atomically does not mean you can use normal assignment to write it. Hence the flag is checked first with atomic.LoadUint32 and written with atomic.StoreUint32.
The direct read of done within doSlow is safe because it is protected from concurrent writes by the mutex. Reading the value concurrently with atomic.LoadUint32 is safe because both are read operations.
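To see these access rules together, here is a rough sketch of a stripped-down re-implementation with a small concurrent driver. The names miniOnce and runConcurrently are hypothetical and exist only for this illustration; the structure follows the code quoted in the question.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// miniOnce is a hypothetical re-implementation following the structure
// of the sync.Once code quoted in the question.
type miniOnce struct {
	done uint32
	m    sync.Mutex
}

func (o *miniOnce) Do(f func()) {
	if atomic.LoadUint32(&o.done) == 0 { // fast path: no lock held, so atomic
		o.doSlow(f)
	}
}

func (o *miniOnce) doSlow(f func()) {
	o.m.Lock()
	defer o.m.Unlock()
	if o.done == 0 { // plain read: every writer holds the same mutex
		// Atomic because fast-path loads may run concurrently with it.
		defer atomic.StoreUint32(&o.done, 1)
		f()
	}
}

// runConcurrently calls Do from n goroutines. It returns how many times f
// ran and how many callers saw f's effect after Do returned.
func runConcurrently(n int) (calls int, sawValue int32) {
	var o miniOnce
	var wg sync.WaitGroup
	value := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			o.Do(func() {
				calls++ // only ever executed under the mutex
				value = 42
			})
			if value == 42 { // Do has returned, so f's write is visible
				atomic.AddInt32(&sawValue, 1)
			}
		}()
	}
	wg.Wait()
	return calls, sawValue
}

func main() {
	fmt.Println(runConcurrently(100)) // prints 1 100
}
```

Running this under the race detector (go run -race) should report nothing, whereas changing the deferred store to a plain o.done = 1 would make the fast-path load and the write a data race.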
Upvotes: 5
Reputation: 1
Atomic operations can be used to synchronize the execution of different goroutines.
Without synchronization, even if a goroutine observes o.done == 1, there is no guarantee that it will observe the effect of f().
Upvotes: -1
Reputation: 319
func (o *Once) Do(f func()) {
	if atomic.LoadUint32(&o.done) == 0 { // 1
		// Outlined slow-path to allow inlining of the fast-path.
		o.doSlow(f)
	}
}

func (o *Once) doSlow(f func()) {
	o.m.Lock()
	defer o.m.Unlock()
	if o.done == 0 { // 2
		defer atomic.StoreUint32(&o.done, 1) // 3
		f()
	}
}
Upvotes: 0