The Volatile Keyword and CPU Cache Coherence Protocol

Question

The CPU already guarantees cache coherence through protocols such as MESI. Why do some languages (like Java) also need volatile to keep writes visible across threads?

My guess is that those protocols aren't enabled at boot and must be triggered by instructions such as LOCK.

If that is really the case, why doesn't the CPU enable the protocol at boot?

pveentjer · Accepted Answer

Volatile prevents 3 different flavors of problems:

visibility
reordering
atomicity

I'm assuming X86.

First of all, caches on the X86 are always coherent. Once one CPU has committed a store to some variable to the cache, another CPU can no longer load the old value of that variable. This is the domain of the MESI protocol.

Assuming that every put and get in the Java bytecode is translated (and not optimized away) to a store and a load on the CPU, then even without volatile, every get would see the most recent put to the same address.

The issue here is that the compiler (JIT in this case) has a lot of freedom to optimize code. For example, if it detects that the same field is read in a loop, it could decide to hoist the read out of the loop:

    for(...){
        int tmp = a;
        println(tmp);
    }

After hoisting:

    int tmp = a;
    for(...){
        println(tmp);
    }

This is fine if the field is only touched by one thread. But if another thread updates the field, the first thread will never see the change. Using volatile prevents such visibility problems; this is effectively the behavior of:

C-style volatile
the Java volatile before the Java memory model was revised by JSR-133
a VarHandle with opaque access mode
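
A minimal sketch of the visibility point, assuming a hypothetical StopFlagDemo class (none of these names come from the question). A worker spins on a flag that another thread sets; because the flag is volatile, the JIT may not hoist the read out of the loop, so the worker observes the update and exits.

```java
// Hypothetical demo class; names are illustrative only.
class StopFlagDemo {
    // volatile: the read in the loop below may not be hoisted out of it.
    private volatile boolean stop = false;
    private long iterations = 0; // touched only by the worker thread

    // Spins until another thread sets 'stop'.
    void spin() {
        while (!stop) {
            iterations++;
        }
    }

    // Starts a worker, sets the flag, and reports whether the worker exited in time.
    static boolean runAndStop() {
        StopFlagDemo demo = new StopFlagDemo();
        Thread worker = new Thread(demo::spin);
        worker.setDaemon(true); // do not keep the JVM alive if the worker never stops
        worker.start();
        try {
            Thread.sleep(100);   // give the JIT a chance to compile the loop
            demo.stop = true;    // volatile store, visible to the worker
            worker.join(5_000);  // wait at most 5 seconds
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + runAndStop());
    }
}
```

If volatile is removed from the field, the outcome depends on whether the JIT actually hoists the read; the worker may then keep spinning until the join timeout expires.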

Then there is another very important aspect of volatile: it prevents loads and stores to different addresses in the instruction stream executed by a CPU from being reordered. The JIT compiler and the CPU have a lot of liberty to reorder loads and stores, although on the X86 only older stores can be reordered with newer loads to a different address, due to store buffers.

So imagine the following code:

int a;
volatile int b;

thread1:
    a=1;
    b=1;

thread2:
    if(b==1) print(a);

The fact that b is volatile prevents the store a=1 from being moved after the store b=1. It also prevents the load of a from being moved before the load of b. So thread 2 is guaranteed to see a=1 when it reads b=1.

So by using volatile, you can ensure that writes to non-volatile fields become visible to other threads as well.
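
The a/b example can be turned into a small runnable sketch. PublishDemo and its method names are hypothetical, chosen only for illustration; the reader spins on the volatile field b and then reads the plain field a.

```java
// Hypothetical demo class mirroring the a/b example; names are illustrative only.
class PublishDemo {
    private int a;           // plain field, published through b
    private volatile int b;  // volatile guard

    // Plays the role of thread1.
    void writer() {
        a = 1; // ordinary store
        b = 1; // volatile store: the store to a cannot move after it
    }

    // Plays the role of thread2, but waits for b instead of checking it once.
    int reader() {
        while (b != 1) {
            // spin until the volatile store to b is observed
        }
        return a; // the load of a cannot move before the load of b
    }

    // Runs one writer and one reader and returns the value of a the reader saw.
    static int runOnce() {
        PublishDemo demo = new PublishDemo();
        int[] seen = new int[1];
        Thread t2 = new Thread(() -> seen[0] = demo.reader());
        Thread t1 = new Thread(demo::writer);
        t2.start();
        t1.start();
        try {
            t1.join();
            t2.join(); // join makes seen[0] safely readable here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("reader saw a = " + runOnce());
    }
}
```

Under the Java memory model, runOnce() always returns 1: the volatile write to b happens-before the volatile read that observes it, so the earlier write to a is visible too.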

If you want to understand how volatile works, I would suggest digging into the Java memory model, which is expressed in synchronizes-with and happens-before rules, as Margaret Bloom already indicated. I have given some low-level details, but in the case of Java it is best to work with this high-level model instead of thinking in terms of hardware. Thinking exclusively in terms of hardware and fences is only for experts; it is non-portable and very fragile.
