Reputation: 1420
I was reading the spinlock code in the Linux kernel. There are two functions related to spinlocks; see the code below:
static __always_inline void __ticket_spin_lock(raw_spinlock_t *lock)
{
short inc = 0x0100;
asm volatile (
LOCK_PREFIX "xaddw %w0, %1\n"
"1:\t"
"cmpb %h0, %b0\n\t"
"je 2f\n\t"
"rep ; nop\n\t"
"movb %1, %b0\n\t"
/* don't need lfence here, because loads are in-order */
"jmp 1b\n"
"2:"
: "+Q" (inc), "+m" (lock->slock)
:
: "memory", "cc");
}
static __always_inline void __ticket_spin_lock(raw_spinlock_t *lock)
{
int inc = 0x00010000;
int tmp;
asm volatile(LOCK_PREFIX "xaddl %0, %1\n"
"movzwl %w0, %2\n\t"
"shrl $16, %0\n\t"
"1:\t"
"cmpl %0, %2\n\t"
"je 2f\n\t"
"rep ; nop\n\t"
"movzwl %1, %2\n\t"
/* don't need lfence here, because loads are in-order */
"jmp 1b\n"
"2:"
: "+r" (inc), "+m" (lock->slock), "=&r" (tmp)
:
: "memory", "cc");
}
I have two questions:
1. What's the difference between the two functions above?
2. What can I do to monitor the spinlock waiting time (the time from the first attempt to take the lock until it is finally acquired)? Does the variable inc represent the spinlock waiting time?
Upvotes: 3
Views: 1797
Reputation: 477070
Let me first explain how the spinlock code works. We have the variables
uint16_t inc = 0x0100,
lock->slock; // I'll just call this "slock"
In the assembler code, inc is referred to as %0 and slock as %1. Moreover, %b0 denotes the lower 8 bits of inc, i.e. inc % 0x100, and %h0 the upper 8 bits, i.e. inc / 0x100.
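Viewed that way, slock packs the two fields of a ticket lock into one 16-bit word: the low byte is the ticket currently being served, the high byte is the next ticket to hand out. Here is a tiny illustration of that layout; the names owner and next are mine, not the kernel's:
#include <stdint.h>

/* Hypothetical helpers illustrating the 16-bit slock layout:
 *   low  byte = ticket currently being served ("owner")
 *   high byte = next ticket to hand out ("next")
 * After the xadd below, %b0 holds the owner byte and %h0 holds our ticket. */
static inline uint8_t slock_owner(uint16_t slock) { return slock & 0xFF; }
static inline uint8_t slock_next(uint16_t slock)  { return slock >> 8;  }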
Now:
lock xaddw %w0, %1 ;; "inc := slock" and "slock := inc + slock"
;; simultaneously (atomic exchange and increment)
1:
cmpb %h0, %b0 ;; "if (inc / 256 == inc % 256)"
je 2f ;; " goto 2;"
rep ; nop ;; "yield();"
movb %1, %b0 ;; "inc = slock;"
jmp 1b ;; "goto 1;"
2:
Comparing the upper and lower byte of inc succeeds if the two bytes are equal. Since inc has the value of the original lock, this happens if the lock is unlocked (nobody holding it and nobody waiting). In that case, the upper byte of the lock will already have been incremented by the atomic exchange-and-increment, so its two bytes now differ and it is locked.
Otherwise, i.e. if the lock had already been locked, we pause a little, then update the lower byte of inc to the lock's current lower byte, and try again.
(I believe there's actually a possibility for an overflow, if 2^8 threads simultaneously attempt to get the spinlock. In that case, slock is updated to 0x0100, 0x0200, ..., 0xFF00, 0x0000, and would then appear to be unlocked. Maybe that's why the second version of the code uses a 16-bit wide counter, which would require 2^16 simultaneous attempts.)
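To make the loop structure explicit, here is a rough C rendering of the same logic. It is only a sketch: the function name and the GCC __sync_fetch_and_add / __builtin_ia32_pause builtins are my substitutions for the kernel's own primitives, not what the kernel actually uses.
#include <stdint.h>

/* Sketch of the 8-bit ticket lock above in plain C (not the kernel's code). */
static void ticket_spin_lock_sketch(volatile uint16_t *slock)
{
    /* lock xaddw: fetch the old word and add 0x0100 atomically.
     * The old high byte is our ticket; "next" is advanced for later waiters. */
    uint16_t old = __sync_fetch_and_add(slock, (uint16_t)0x0100);
    uint8_t my_ticket = old >> 8;      /* %h0 */
    uint8_t owner     = old & 0xFF;    /* %b0 */

    while (owner != my_ticket) {       /* cmpb %h0, %b0 ; je 2f */
        __builtin_ia32_pause();        /* rep ; nop */
        owner = (uint8_t)(*slock);     /* movb %1, %b0: re-read the owner byte */
    }
}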
Now let's insert a counter:
uint32_t spincounter = 0;
asm volatile( /* code below */
    : "+Q" (inc), "+m" (lock->slock), "+r" (spincounter)
    :
    : "memory", "cc");
Now spincounter may be referred to as %2. We just need to increment the counter on each iteration of the loop:
1:
inc %2
cmpb %h0, %b0
;; etc etc
I haven't tested this, but that's the general idea.
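For reference, here is an untested sketch of what the whole modified 8-bit version could look like. The function name __ticket_spin_lock_counted and the count output parameter are my additions, and note that this measures spin-loop iterations, not wall-clock time.
static __always_inline void __ticket_spin_lock_counted(raw_spinlock_t *lock,
                                                       unsigned int *count)
{
    short inc = 0x0100;
    unsigned int spincounter = 0;

    asm volatile (
        LOCK_PREFIX "xaddw %w0, %1\n"
        "1:\t"
        "incl %2\n\t"              /* spincounter++ on every pass */
        "cmpb %h0, %b0\n\t"
        "je 2f\n\t"
        "rep ; nop\n\t"
        "movb %1, %b0\n\t"
        "jmp 1b\n"
        "2:"
        : "+Q" (inc), "+m" (lock->slock), "+r" (spincounter)
        :
        : "memory", "cc");

    *count = spincounter;
}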
Upvotes: 2