Reputation: 1420
I was reading the spinlock code in the Linux kernel. There are two functions related to spinlocks; see the code below:
static __always_inline void __ticket_spin_lock(raw_spinlock_t *lock)
{
short inc = 0x0100;
asm volatile (
LOCK_PREFIX "xaddw %w0, %1\n"
"1:\t"
"cmpb %h0, %b0\n\t"
"je 2f\n\t"
"rep ; nop\n\t"
"movb %1, %b0\n\t"
/* don't need lfence here, because loads are in-order */
"jmp 1b\n"
"2:"
: "+Q" (inc), "+m" (lock->slock)
:
: "memory", "cc");
}
static __always_inline void __ticket_spin_lock(raw_spinlock_t *lock)
{
int inc = 0x00010000;
int tmp;
asm volatile(LOCK_PREFIX "xaddl %0, %1\n"
"movzwl %w0, %2\n\t"
"shrl $16, %0\n\t"
"1:\t"
"cmpl %0, %2\n\t"
"je 2f\n\t"
"rep ; nop\n\t"
"movzwl %1, %2\n\t"
/* don't need lfence here, because loads are in-order */
"jmp 1b\n"
"2:"
: "+r" (inc), "+m" (lock->slock), "=&r" (tmp)
:
: "memory", "cc");
}
I have two questions:
1. What's the difference between the two functions above?
2. What can I do to monitor the spinlock waiting time (the time from the first attempt to take the lock until it is finally acquired)? Does the variable inc represent the spinlock waiting time?
Upvotes: 3
Views: 1797
Reputation: 477070
Let me first explain how the spinlock code works. We have the variables
uint16_t inc = 0x0100,
lock->slock; // I'll just call this "slock"
In the assembler code, inc is referred to as %0 and slock as %1. Moreover, %b0 denotes the lower 8 bits of inc, i.e. inc % 0x100, and %h0 the upper 8 bits, i.e. inc / 0x100.
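Viewed that way, slock packs the two fields of a ticket lock into one 16-bit word: the low byte is the ticket currently being served, the high byte is the next ticket to hand out. Here is a tiny illustration of that layout; the names owner and next are mine, not the kernel's:
#include <stdint.h>

/* Hypothetical helpers illustrating the 16-bit slock layout:
 *   low  byte = ticket currently being served ("owner")
 *   high byte = next ticket to hand out ("next")
 * After the xadd below, %b0 holds the owner byte and %h0 holds our ticket. */
static inline uint8_t slock_owner(uint16_t slock) { return slock & 0xFF; }
static inline uint8_t slock_next(uint16_t slock)  { return slock >> 8;  }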
Now:
lock xaddw %w0, %1 ;; "inc := slock" and "slock := inc + slock"
;; simultaneously (atomic exchange and increment)
1:
cmpb %h0, %b0 ;; "if (inc / 256 == inc % 256)"
je 2f ;; " goto 2;"
rep ; nop ;; "yield();"
movb %1, %b0 ;; "inc = slock;"
jmp 1b ;; "goto 1;"
2:
Comparing the upper and lower byte of inc succeeds if the two bytes are equal. Since inc has the value of the original lock, this happens if the lock is unlocked (nobody holding it and nobody waiting). In that case, the upper byte of the lock will already have been incremented by the atomic exchange-and-increment, so its two bytes now differ and it is locked.
Otherwise, i.e. if the lock had already been locked, we pause a little, then update the lower byte of inc to the lock's current lower byte, and try again.
(I believe there's actually a possibility for an overflow, if 2^8 threads simultaneously attempt to get the spinlock. In that case, slock is updated to 0x0100, 0x0200, ..., 0xFF00, 0x0000, and would then appear to be unlocked. Maybe that's why the second version of the code uses a 16-bit wide counter, which would require 2^16 simultaneous attempts.)
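To make the loop structure explicit, here is a rough C rendering of the same logic. It is only a sketch: the function name and the GCC __sync_fetch_and_add / __builtin_ia32_pause builtins are my substitutions for the kernel's own primitives, not what the kernel actually uses.
#include <stdint.h>

/* Sketch of the 8-bit ticket lock above in plain C (not the kernel's code). */
static void ticket_spin_lock_sketch(volatile uint16_t *slock)
{
    /* lock xaddw: fetch the old word and add 0x0100 atomically.
     * The old high byte is our ticket; "next" is advanced for later waiters. */
    uint16_t old = __sync_fetch_and_add(slock, (uint16_t)0x0100);
    uint8_t my_ticket = old >> 8;      /* %h0 */
    uint8_t owner     = old & 0xFF;    /* %b0 */

    while (owner != my_ticket) {       /* cmpb %h0, %b0 ; je 2f */
        __builtin_ia32_pause();        /* rep ; nop */
        owner = (uint8_t)(*slock);     /* movb %1, %b0: re-read the owner byte */
    }
}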
Now let's insert a counter:
uint32_t spincounter = 0;
asm volatile( /* code below */
    : "+Q" (inc), "+m" (lock->slock), "+r" (spincounter)
    :
    : "memory", "cc");
Now spincounter may be referred to as %2. We just need to increment the counter on each iteration of the loop:
1:
inc %2
cmpb %h0, %b0
;; etc etc
I haven't tested this, but that's the general idea.
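For reference, here is an untested sketch of what the whole modified 8-bit version could look like. The function name __ticket_spin_lock_counted and the count output parameter are my additions, and note that this measures spin-loop iterations, not wall-clock time.
static __always_inline void __ticket_spin_lock_counted(raw_spinlock_t *lock,
                                                       unsigned int *count)
{
    short inc = 0x0100;
    unsigned int spincounter = 0;

    asm volatile (
        LOCK_PREFIX "xaddw %w0, %1\n"
        "1:\t"
        "incl %2\n\t"              /* spincounter++ on every pass */
        "cmpb %h0, %b0\n\t"
        "je 2f\n\t"
        "rep ; nop\n\t"
        "movb %1, %b0\n\t"
        "jmp 1b\n"
        "2:"
        : "+Q" (inc), "+m" (lock->slock), "+r" (spincounter)
        :
        : "memory", "cc");

    *count = spincounter;
}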
Upvotes: 2