Ben Hymers
Ben Hymers

Reputation: 26546

How does a mutex ensure a variable's value is consistent across cores?

If I have a single int which I want to write to from one thread and read from on another, I need to use std::atomic, to ensure that its value is consistent across cores, regardless of whether or not the instructions that read from and write to it are conceptually atomic. If I don't, it may be that the reading core has an old value in its cache, and will not see the new value. This makes sense to me.

If I have some complex data type that cannot be read/written to atomically, I need to guard access to it using some synchronisation primitive, such as std::mutex. This will prevent the object getting into (or being read from) an inconsistent state. This makes sense to me.

What doesn't make sense to me is how mutexes help with the caching problem that atomics solve. They seem to exist solely to prevent concurrent access to some resource, but not to propagate any values contained within that resource to other cores' caches. Is there some part of their semantics I've missed which deals with this?

Upvotes: 22

Views: 2984

Answers (3)

Mike Vine
Mike Vine

Reputation: 9837

The right answer to this is magic pixies - e.g. It Just Works. The implementation of std::atomic for each platform must do the right thing.

The right thing is a combination of 3 parts.

Firstly, the compiler needs to know that it can't move instructions across boundaries [in fact it can in some cases, but assume that it doesn't].

Secondly, the cache/memory subsystem needs to know - this is generally done using memory barriers, although x86/x64 generally have such strong memory guarantees that this isn't necessary in the vast majority of cases (which is a big shame as its nice for wrong code to actually go wrong).

Finally the CPU needs to know it cannot reorder instructions. Modern CPUs are massively aggressive at reordering operations and making sure in the single threaded case that this is unnoticeable. They may need more hints that this cannot happen in certain places.

For most CPUs part 2 and 3 come down to the same thing - a memory barrier implies both. Part 1 is totally inside the compiler, and is down to the compiler writers to get right.

See Herb Sutters talk 'Atomic Weapons' for a lot more interesting info.

Upvotes: 6

Timo
Timo

Reputation: 5176

Most modern multicore processors (including x86 and x64) are cache coherent. If two cores hold the same memory location in cache and one of them updates the value, the change is automatically propagated to other cores' caches. It's inefficient (writing to the same cache line at the same time from two cores is really slow) but without cache coherence it would be very difficult to write multithreaded software.

And like syam said, memory barriers are also required. They prevent the compiler or processor from reordering memory accesses, and also force the write into memory (or at least into cache), when for example a variable is held in a register because of compiler optizations.

Upvotes: 5

syam
syam

Reputation: 15069

The consistency across cores is ensured by memory barriers (which also prevents instructions reordering). When you use std::atomic, not only do you access the data atomically, but the compiler (and library) also insert the relevant memory barriers.

Mutexes work the same way: the mutex implementations (eg. pthreads or WinAPI or what not) internally also insert memory barriers.

Upvotes: 6

Related Questions