Ashish Negi

Reputation: 5301

How is locking implemented?

I have the following code:

 while(lock)
      ;   
 lock = 1;
 // critical section
 lock = 0;

Reading or changing the lock value is itself a multi-instruction sequence:

read lock
change value
write it

If it happens like this (a rough runnable sketch follows the list):

1) One thread reads the lock and is preempted right there
2) Another thread reads it, sees it is free, locks it, and gets halfway through its work
3) The first thread wakes up and also enters the critical section
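
For instance, this small POSIX-threads program (worker and counter are made-up names for the example; build with -pthread, and volatile is only there to keep the spin from being optimized away) shows two threads slipping past the check together, so the final count typically falls short of 2000000:

    #include <pthread.h>
    #include <stdio.h>

    /* Broken "lock": the check (while) and the set (lock = 1) are separate
     * steps, so both threads can see lock == 0 and enter the critical
     * section at the same time. volatile only stops the compiler from
     * optimizing the spin away; it does NOT make this correct. */
    volatile int lock = 0;
    volatile long counter = 0;

    void *worker(void *arg)
    {
        (void)arg;
        for (int i = 0; i < 1000000; i++) {
            while (lock)   /* read the lock...                           */
                ;          /* ...the other thread can slip in right here */
            lock = 1;      /* ...then write it                           */
            counter++;     /* critical section */
            lock = 0;
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t a, b;
        pthread_create(&a, NULL, worker, NULL);
        pthread_create(&b, NULL, worker, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        printf("counter = %ld (expected 2000000)\n", counter);
        return 0;
    }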

So how would locking be implemented in a system? Placing one guard variable on top of another is not right: it would be like guarding the guard.

Stopping the threads on all other processors does not seem right either.

Upvotes: 1

Views: 613

Answers (2)

Tudor

Reputation: 62439

In a course I studied at uni, a possible firmware solution for implementing locks was presented in the form of an "atomicity bit" associated with each memory operation initiated by a processor.

Basically, when locking, you'll notice that you have a sequence of operations that needs to be executed atomically: test the value of the flag and, if it is not set, set it to locked, otherwise try again. This sequence can be made atomic by associating a bit with each memory request sent by the CPU. The first N-1 operations will have the bit set, while the last one will have it unset, to mark the end of the atomic sequence.

When the memory module (there can be several modules) where the flag is stored receives the request for the first operation in the sequence (whose bit is set), it will serve it and not take requests from any other CPU until the CPU that initiated the atomic sequence sends a request with an unset atomicity bit (since these transactions are usually short, a coarse-grained approach like this is acceptable). Note that this is usually made easier by the instruction set providing specialized instructions such as "compare-and-set", which do exactly what I described above.
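
For illustration, here is what a lock built on such a compare-and-set instruction might look like, sketched with C11 atomics (the names spin_lock/spin_unlock are made up for this example):

    #include <stdatomic.h>

    /* Minimal spinlock built on a compare-and-set primitive
     * (C11 atomics, purely illustrative). */
    void spin_lock(atomic_int *flag)
    {
        int expected = 0;
        /* Atomically: if *flag == 0, set it to 1; otherwise fail and retry.
         * A failed CAS overwrites 'expected' with the current value,
         * so reset it before trying again. */
        while (!atomic_compare_exchange_weak(flag, &expected, 1))
            expected = 0;
    }

    void spin_unlock(atomic_int *flag)
    {
        atomic_store(flag, 0);
    }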

Upvotes: 0

David Schwartz

Reputation: 182753

It is 100% platform specific. Generally, the CPU provides some form of atomic operation such as exchange or compare-and-swap. A typical lock might work like this (a rough sketch follows the list):

1) Create: Store 0 (unlocked) in the variable.

2) Lock: Atomically attempt to switch the value of the variable from 0 (unlocked) to 1 (locked). If we failed (because it wasn't unlocked to begin with), let the CPU rest a bit, and then retry. Use a memory barrier to ensure no future memory operations sneak behind this one.

3) Unlock: Use a memory barrier to ensure previous memory operations don't sneak past this one. Atomically write 0 (unlocked) to the variable.
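
For illustration, a minimal sketch of those three steps using C11 atomics, with POSIX sched_yield() standing in for "let the CPU rest a bit" (the type and function names here are made up for the example, not a real library API):

    #include <sched.h>      /* sched_yield(): one way to "let the CPU rest" */
    #include <stdatomic.h>

    typedef struct { atomic_int state; } spinlock_t;

    /* 1) Create: store 0 (unlocked). */
    void spinlock_init(spinlock_t *s)
    {
        atomic_init(&s->state, 0);
    }

    /* 2) Lock: atomically swap in 1; if the old value was already 1, the lock
     *    was held, so back off and retry. memory_order_acquire keeps later
     *    memory operations from sneaking above the lock. */
    void spinlock_lock(spinlock_t *s)
    {
        while (atomic_exchange_explicit(&s->state, 1, memory_order_acquire) == 1)
            sched_yield();
    }

    /* 3) Unlock: memory_order_release keeps earlier memory operations from
     *    sneaking below the unlock. Atomically write 0 (unlocked). */
    void spinlock_unlock(spinlock_t *s)
    {
        atomic_store_explicit(&s->state, 0, memory_order_release);
    }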

Note that you really don't need to understand this unless you want to design your own synchronization primitives. And if you want to do that, you need to understand an awful lot more. It's certainly a good idea for every programmer to have a general idea of what he's making the hardware do. But this is an area filled with seriously heavy wizardry. There are so many, many ways this can go horribly wrong. So just use the locking primitives provided by the geniuses who made your platform, compiler, and threading library. Here be dragons.

For example, SMP Pentium Pro systems have an erratum that requires special handling in the unlock operation. A naive implementation of the lock algorithm will cause the branch prediction logic to expect the operation to keep spinning, incurring a massive performance penalty at the worst possible time -- when you first acquire the lock. A naive implementation of the lock algorithm may cause two cores each waiting for the same lock to saturate the bus, slowing the CPU that needs to get work done in order to release the lock to a crawl. These all require heavy wizardry and deep understanding of the hardware to deal with.

Upvotes: 2
