Reputation: 105067
I'm having some trouble understanding what the lifecycle of lock records is when dealing with "thin-locks" on HotSpot.
My understanding is that:
When a thread T first attempts to acquire a lock on object o, it triggers a "thin lock" creation -- a lock record is created on T's stack, in the current frame F, and a copy of the mark word (from now on referred to as the displaced header) plus a reference to o is stored in F. Through a CAS operation, o's header is made to reference the lock record (and the last two bits are set to 00 to mark this object as thin-locked!).
There are multiple reasons why the CAS operation could fail, though:
- Another thread was quicker to grab the lock, so we'll need to turn this thin lock into a full-blown monitor instead;
- The CAS failed, but it can be seen that the reference to the lock record points into T's stack, so we must be attempting to re-enter the same lock, which is fine. In that case, the lock record of the current stack frame is kept null.
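To check that I have the steps right, here is a toy model of the acquisition path as I understand it. It is only a sketch, not HotSpot code: `ThinLockEnterSketch`, `ToyObject`, `LockRecord` and `Outcome` are names I made up, the "mark word" is an `AtomicReference` rather than a real header, and "the record lies in T's stack" is approximated by comparing an `owner` field.

```java
import java.util.concurrent.atomic.AtomicReference;

final class ThinLockEnterSketch {
    enum Outcome { ACQUIRED, RECURSIVE, CONTENDED }

    /** Hypothetical lock record; in the real VM it lives in the owner's stack frame. */
    static final class LockRecord {
        final Thread owner;      // stands in for "this address lies within T's stack"
        Object displacedHeader;  // saved mark word, or null for a recursive entry
        LockRecord(Thread owner) { this.owner = owner; }
    }

    /** Hypothetical object: its "mark word" is either UNLOCKED or a LockRecord. */
    static final class ToyObject {
        static final Object UNLOCKED = new Object();
        final AtomicReference<Object> markWord = new AtomicReference<>(UNLOCKED);
    }

    static Outcome enter(ToyObject o, LockRecord record) {
        Object mark = o.markWord.get();
        if (mark == ToyObject.UNLOCKED) {
            record.displacedHeader = mark;                 // save the displaced header first
            if (o.markWord.compareAndSet(mark, record)) {  // then try to install the record
                return Outcome.ACQUIRED;
            }
            mark = o.markWord.get();                       // the CAS lost a race; look again
        }
        if (mark instanceof LockRecord && ((LockRecord) mark).owner == record.owner) {
            record.displacedHeader = null;                 // recursive entry: store "zero"
            return Outcome.RECURSIVE;
        }
        return Outcome.CONTENDED;                          // a real VM would inflate here
    }

    public static void main(String[] args) {
        ToyObject o = new ToyObject();
        Thread me = Thread.currentThread();
        System.out.println(enter(o, new LockRecord(me)));
        System.out.println(enter(o, new LockRecord(me)));
        System.out.println(enter(o, new LockRecord(new Thread(() -> { }))));
    }
}
```

In this model the first call acquires the lock, the second is detected as a recursive entry by the same thread, and the third (a record owned by a different thread) falls through to the contended case.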
Given this, I have a couple of questions:
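- Why would we create a new lock record each time we attempt to enter a lock? Wouldn't it be preferable to just keep a single lock record for each object o?
- When leaving a synchronized block, I failed to understand how can the VM know whether we should release the lock or whether we're still "unwinding" from a recursive lock.
Could anyone shed some light on this?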
Let me quote a paragraph from the last link:
Whenever an object is lightweight locked by a monitorenter bytecode, a lock record is either implicitly or explicitly allocated on the stack of the thread performing the lock acquisition operation. The lock record holds the original value of the object’s mark word and also contains metadata necessary to identify which object is locked. During lock acquisition, the mark word is copied into the lock record (such a copy is called a displaced mark word), and an atomic compare-and-swap (CAS) operation is performed to attempt to make the object’s mark word point to the lock record. If the CAS succeeds, the current thread owns the lock. If it fails, because some other thread acquired the lock, a slow path is taken in which the lock is inflated, during which operation an OS mutex and condition variable are associated with the object. During the inflation process, the object’s mark word is updated with a CAS to point to a data structure containing pointers to the mutex and condition variable. During an unlock operation, an attempt is made to CAS the mark word, which should still point to the lock record, with the displaced mark word stored in the lock record. If the CAS succeeds, there was no contention for the monitor and lightweight locking remains in effect. If it fails, the lock was contended while it was held and a slow path is taken to properly release the lock and notify other threads waiting to acquire the lock. Recursive locking is handled in a straightforward fashion. If during lightweight lock acquisition it is determined that the current thread already owns the lock by virtue of the object’s mark word pointing into its stack, a zero is stored into the on-stack lock record rather than the current value of the object’s mark word. If zero is seen in a lock record during an unlock operation, the object is known to be recursively locked by the current thread and no update of the object’s mark word occurs. 
The number of such lock records implicitly records the monitor recursion count. This is a significant property to the best of our knowledge not attained by most other JVMs.
Thanks
Upvotes: 3
Views: 506
Reputation: 98304
Why would we create a new lock record each time we attempt to enter a lock? Wouldn't it be preferable to just keep a single lock record for each object o?
It seems you've missed the main point of lock records. A lock record is not a per-object entity, but rather a per-lock-site one. If, for example, a method has 3 synchronized blocks, its stack frame may have up to 3 lock records, no matter whether they refer to 3 different locked objects or to the same object recursively locked 3 times.
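As a small illustration of "lock sites", here is a method with three synchronized blocks. The class and method names (`LockSitesSketch`, `enterThreeSites`) are made up for this example. Lock records cannot be observed from Java code, so the method only counts how many sites were entered.

```java
final class LockSitesSketch {
    /** Three lock sites in one method; a and b may or may not be the same object. */
    static int enterThreeSites(Object a, Object b) {
        int sitesEntered = 0;
        synchronized (a) {              // lock site 1
            sitesEntered++;
            synchronized (b) {          // lock site 2
                sitesEntered++;
                synchronized (a) {      // lock site 3: always a recursive lock on a
                    sitesEntered++;
                }
            }
        }
        return sitesEntered;
    }

    public static void main(String[] args) {
        Object x = new Object();
        Object y = new Object();
        System.out.println(enterThreeSites(x, y)); // two distinct objects
        System.out.println(enterThreeSites(x, x)); // one object locked three times
    }
}
```

Whether it is called with two different objects or with the same object twice, the method passes through the same three sites, which is why the frame may need up to three slots either way.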
Lock records (actually, they are not called so in HotSpot sources; they are usually referred to as a "monitor", "monitor slot", "monitors block", etc.) help to maintain the mapping between a stack frame and its locked monitors. In particular, when a stack frame is removed due to an exception, all locks need to be automatically released. So, think of the monitor slots as something like local variable slots, which can hold references to the same or different objects. Like local variables, monitors are associated with a given stack frame. They hold references to the locked objects, but they are not "locks" themselves.
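The "locks are released when the frame goes away" behaviour can be observed from plain Java, using the standard `Thread.holdsLock` method. This is only a sketch with made-up names (`UnwindReleaseSketch`, `failWhileLocked`, `heldAfterException`), and it shows the visible effect only: which mechanism actually releases the lock (a compiler-generated handler or the frame's monitor slots) cannot be seen from Java code.

```java
final class UnwindReleaseSketch {
    /** Throws while holding the lock, so the synchronized block is left abnormally. */
    static void failWhileLocked(Object lock) {
        synchronized (lock) {
            if (!Thread.holdsLock(lock)) {
                throw new AssertionError("expected to hold the lock here");
            }
            throw new IllegalStateException("leaving the frame abnormally");
        }
    }

    /** Returns whether the current thread still holds the lock after the exception. */
    static boolean heldAfterException(Object lock) {
        try {
            failWhileLocked(lock);
        } catch (IllegalStateException expected) {
            // the frame of failWhileLocked is gone at this point
        }
        return Thread.holdsLock(lock);
    }

    public static void main(String[] args) {
        System.out.println(heldAfterException(new Object()));
    }
}
```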
When leaving a synchronized block, I failed to understand how can the VM know whether we should release the lock or whether we're still "unwinding" from a recursive lock.
A lock record (a monitor slot) holds two things: a reference to the locked object and a so-called "displaced header". The displaced header is either the previous (unlocked) value of the object header, or zero if it was a recursive lock.
As I explained above, if we lock an object 3 times, there will be 3 lock records. Only the first one holds the actual non-zero displaced header; the other two will have zeros. This means the first two monitorexit instructions will pop lock records with zeros, recognize that it is a recursive lock, and thus will not update the object. When the last lock record is removed, the JVM sees a non-zero value in the displaced header and stores it back into the real object header, thus marking it unlocked.
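The unwinding described above can be sketched as a small single-threaded model. It is not HotSpot code: the class names and the two header constants are hypothetical, there is no CAS because only one thread is involved, and records are simply popped in reverse order of entry.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class RecursiveUnlockSketch {
    static final long UNLOCKED_HEADER = 0x2A01L;    // hypothetical unlocked mark word
    static final long THIN_LOCKED_HEADER = 0x7F00L; // hypothetical "pointer", low bits 00

    static final class ToyObject {
        long header = UNLOCKED_HEADER;
    }

    static final class LockRecord {
        final ToyObject obj;
        final long displacedHeader; // saved header, or 0 for a recursive entry
        LockRecord(ToyObject obj, long displacedHeader) {
            this.obj = obj;
            this.displacedHeader = displacedHeader;
        }
    }

    /** Monitor slots of one frame, most recent on top. */
    final Deque<LockRecord> slots = new ArrayDeque<>();

    void enter(ToyObject o) {
        if (o.header == THIN_LOCKED_HEADER) {
            slots.push(new LockRecord(o, 0L));        // recursive: store zero
        } else {
            slots.push(new LockRecord(o, o.header));  // first entry: save the header
            o.header = THIN_LOCKED_HEADER;
        }
    }

    /** Pops the newest record; returns true if the object was actually unlocked. */
    boolean exit() {
        LockRecord r = slots.pop();
        if (r.displacedHeader == 0L) {
            return false;                             // still unwinding a recursive lock
        }
        r.obj.header = r.displacedHeader;             // restore the original header
        return true;
    }

    public static void main(String[] args) {
        RecursiveUnlockSketch frame = new RecursiveUnlockSketch();
        ToyObject o = new ToyObject();
        frame.enter(o);
        frame.enter(o);
        frame.enter(o);
        System.out.println(frame.exit());
        System.out.println(frame.exit());
        System.out.println(frame.exit());
    }
}
```

After three entries, the first two exits find a zero and leave the object alone; only the third restores the saved header.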
Upvotes: 3
Reputation: 1516
Note that this is my original attempt to answer the question. It's quite clear to me that the docs linked above answer everything on their own.
As for the first question, the new lock record is created on the stack because it's much cheaper than allocating it on the heap. In many cases, a monitor is never contended, and so this can be a huge win. Stack alloc/free can sometimes be so cheap that it's not even worth considering.
The second question can be answered by noticing that the doc refers to the current frame F. There's always a frame pointer register, and so the monitorexit instruction can simply check if the current frame pointer matches the address of the thin lock record. If so, then it knows it's the last one out.
A key aspect is that the monitorenter/monitorexit instructions must be properly balanced, and the JVM tries to prove this. Otherwise, a ref count would be needed to detect when the last monitorexit instruction was reached. It seems that instead, HotSpot just doesn't bother compiling the code or optimizing monitor acquisition if the monitorenter/monitorexit instructions aren't balanced.
Upvotes: 0