Ashish Negi

Reputation: 5301

How is locking implemented?

I have the following code:

 while(lock)
      ;   
 lock = 1;
 // critical section
 lock = 0;

Reading or changing the lock value is itself a multi-instruction sequence:

read lock
change value
write it

If it happens like this (a rough runnable sketch follows the list):

1) One thread reads the lock and is preempted right there
2) Another thread reads it, sees it is free, locks it, and gets halfway through its work
3) The first thread wakes up and also enters the critical section
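
For instance, this small POSIX-threads program (worker and counter are made-up names for the example; build with -pthread, and volatile is only there to keep the spin from being optimized away) shows two threads slipping past the check together, so the final count typically falls short of 2000000:

    #include <pthread.h>
    #include <stdio.h>

    /* Broken "lock": the check (while) and the set (lock = 1) are separate
     * steps, so both threads can see lock == 0 and enter the critical
     * section at the same time. volatile only stops the compiler from
     * optimizing the spin away; it does NOT make this correct. */
    volatile int lock = 0;
    volatile long counter = 0;

    void *worker(void *arg)
    {
        (void)arg;
        for (int i = 0; i < 1000000; i++) {
            while (lock)   /* read the lock...                           */
                ;          /* ...the other thread can slip in right here */
            lock = 1;      /* ...then write it                           */
            counter++;     /* critical section */
            lock = 0;
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t a, b;
        pthread_create(&a, NULL, worker, NULL);
        pthread_create(&b, NULL, worker, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        printf("counter = %ld (expected 2000000)\n", counter);
        return 0;
    }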

So how would locking be implemented in a system? Placing one guard variable on top of another is not right: it would be like guarding the guard.

Stopping the threads on all other processors does not seem right either.

Upvotes: 1

Views: 613

Answers (2)

Tudor

Reputation: 62439

In a course I studied at uni, a possible firmware solution for implementing locks was presented in the form of an "atomicity bit" associated with each memory operation initiated by a processor.

Basically, when locking, you'll notice that you have a sequence of operations that needs to be executed atomically: test the value of the flag and, if it is not set, set it to locked, otherwise try again. This sequence can be made atomic by associating a bit with each memory request sent by the CPU. The first N-1 operations will have the bit set, while the last one will have it unset, to mark the end of the atomic sequence.

When the memory module (there can be several modules) where the flag is stored receives the request for the first operation in the sequence (whose bit is set), it will serve it and not take requests from any other CPU until the CPU that initiated the atomic sequence sends a request with an unset atomicity bit (since these transactions are usually short, a coarse-grained approach like this is acceptable). Note that this is usually made easier by the instruction set providing specialized instructions such as "compare-and-set", which do exactly what I described above.
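
For illustration, here is what a lock built on such a compare-and-set instruction might look like, sketched with C11 atomics (the names spin_lock/spin_unlock are made up for this example):

    #include <stdatomic.h>

    /* Minimal spinlock built on a compare-and-set primitive
     * (C11 atomics, purely illustrative). */
    void spin_lock(atomic_int *flag)
    {
        int expected = 0;
        /* Atomically: if *flag == 0, set it to 1; otherwise fail and retry.
         * A failed CAS overwrites 'expected' with the current value,
         * so reset it before trying again. */
        while (!atomic_compare_exchange_weak(flag, &expected, 1))
            expected = 0;
    }

    void spin_unlock(atomic_int *flag)
    {
        atomic_store(flag, 0);
    }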

Upvotes: 0

David Schwartz

Reputation: 182753

It is 100% platform specific. Generally, the CPU provides some form of atomic operation such as exchange or compare-and-swap. A typical lock might work like this (a rough sketch follows the list):

1) Create: Store 0 (unlocked) in the variable.

2) Lock: Atomically attempt to switch the value of the variable from 0 (unlocked) to 1 (locked). If we failed (because it wasn't unlocked to begin with), let the CPU rest a bit, and then retry. Use a memory barrier to ensure no future memory operations sneak behind this one.

3) Unlock: Use a memory barrier to ensure previous memory operations don't sneak past this one. Atomically write 0 (unlocked) to the variable.
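
For illustration, a minimal sketch of those three steps using C11 atomics, with POSIX sched_yield() standing in for "let the CPU rest a bit" (the type and function names here are made up for the example, not a real library API):

    #include <sched.h>      /* sched_yield(): one way to "let the CPU rest" */
    #include <stdatomic.h>

    typedef struct { atomic_int state; } spinlock_t;

    /* 1) Create: store 0 (unlocked). */
    void spinlock_init(spinlock_t *s)
    {
        atomic_init(&s->state, 0);
    }

    /* 2) Lock: atomically swap in 1; if the old value was already 1, the lock
     *    was held, so back off and retry. memory_order_acquire keeps later
     *    memory operations from sneaking above the lock. */
    void spinlock_lock(spinlock_t *s)
    {
        while (atomic_exchange_explicit(&s->state, 1, memory_order_acquire) == 1)
            sched_yield();
    }

    /* 3) Unlock: memory_order_release keeps earlier memory operations from
     *    sneaking below the unlock. Atomically write 0 (unlocked). */
    void spinlock_unlock(spinlock_t *s)
    {
        atomic_store_explicit(&s->state, 0, memory_order_release);
    }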

Note that you really don't need to understand this unless you want to design your own synchronization primitives. And if you want to do that, you need to understand an awful lot more. It's certainly a good idea for every programmer to have a general idea of what he's making the hardware do. But this is an area filled with seriously heavy wizardry. There are so many, many ways this can go horribly wrong. So just use the locking primitives provided by the geniuses who made your platform, compiler, and threading library. Here be dragons.

For example, SMP Pentium Pro systems have an erratum that requires special handling in the unlock operation. A naive implementation of the lock algorithm will cause the branch prediction logic to expect the operation to keep spinning, incurring a massive performance penalty at the worst possible time -- when you first acquire the lock. A naive implementation of the lock algorithm may cause two cores each waiting for the same lock to saturate the bus, slowing the CPU that needs to get work done in order to release the lock to a crawl. These all require heavy wizardry and deep understanding of the hardware to deal with.

Upvotes: 2
