Reputation: 215577
I'm having a horrible time trying to debug a race condition that causes a deadlock in my locking primitives, which are based on Linux futexes and atomic ops. Here's the code I'm working with (exact same logic as the real code, just with the dependency on data structures that aren't relevant to the problem pulled out):
int readers, writer, waiting;

void wrlock()
{
    int cur;
    while (atomic_swap(&writer, 1)) spin();
    while ((cur=readers)) futex_wait(&readers, cur);
}

void wrunlock()
{
    atomic_store(&writer, 0);
    if (waiting) futex_wake(&writer, ALL);
}

void rdlock()
{
    atomic_inc(&waiting);
    for (;;) {
        atomic_inc(&readers);
        if (!writer) return;
        atomic_dec(&readers);
        futex_wait(&writer, 1);
    }
}

void rdunlock()
{
    atomic_dec(&waiting);
    atomic_dec(&readers);
    if (writer) futex_wake(&readers, ALL);
}
The atomic_* and spin functions are pretty obvious. The Linux futex ops are futex_wait(int *mem, int val) and futex_wake(int *mem, int how_many_to_wake).
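(In case it's useful context: those two are thin wrappers over the raw futex syscall. A rough sketch of what they look like follows; it's not necessarily byte-for-byte what my code does, just the shape of it.)

#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Sketch of the wrappers: futex_wait sleeps only if *mem still equals val
 * (the check and the sleep are atomic with respect to futex_wake), and
 * futex_wake wakes up to cnt threads sleeping on mem. */
static void futex_wait(int *mem, int val)
{
    syscall(SYS_futex, mem, FUTEX_WAIT, val, NULL, NULL, 0);
}

static void futex_wake(int *mem, int cnt)
{
    syscall(SYS_futex, mem, FUTEX_WAKE, cnt, NULL, NULL, 0);
}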
The deadlock condition I'm running into is 3 threads with readers==0, writer==1, waiting==2, and all threads blocked in futex_wait. I don't see how this can happen.
And for everyone who wants to yell at me for not using pthread primitives, please save it for another question. This is low-level code which operates without dependence on glibc/libpthread. In any case I think the question is probably useful to others for learning about low-level concurrency black magic, or maybe to scare anyone else away from trying to write code like this... ;-)
By the way, the hardware is x86, so even if there are memory-ordering issues with the code, I don't think they'd manifest as bugs. I'm guessing there's just a subtle misuse of futexes I'm missing, especially since the code works fine when all the waits are dummied out as spins.
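(For anyone who wants to poke at it: a stress harness along these lines should be enough to hit the bad state. This is only a sketch; the pthread calls and the reader_thread/writer_thread names are just for the test harness, the locking code itself has no pthread dependency.)

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Hammer the lock from 2 reader threads and 1 writer thread, dumping the
 * state once a second so the wedged readers==0, writer==1, waiting==2
 * condition is visible when it happens. */
static void *reader_thread(void *arg)
{
    (void)arg;
    for (;;) {
        rdlock();
        rdunlock();
    }
    return NULL;
}

static void *writer_thread(void *arg)
{
    (void)arg;
    for (;;) {
        wrlock();
        wrunlock();
    }
    return NULL;
}

int main(void)
{
    pthread_t t[3];
    pthread_create(&t[0], NULL, reader_thread, NULL);
    pthread_create(&t[1], NULL, reader_thread, NULL);
    pthread_create(&t[2], NULL, writer_thread, NULL);
    for (;;) {
        sleep(1);
        printf("readers=%d writer=%d waiting=%d\n", readers, writer, waiting);
    }
}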
Here's the generated asm for wrlock (basically identical to the version I posted, except that it calls a separate function lock for the first spinlock). The additional conditional return at the beginning is "if we're not running multiple threads, bail out". 0x338 and 0x33c correspond to readers and writer. The call 1af is actually a relocation for a call to futex_wait, which is external.
00000184 <wrlock>:
184: a1 18 00 00 00 mov 0x18,%eax
189: 55 push %ebp
18a: 85 c0 test %eax,%eax
18c: 89 e5 mov %esp,%ebp
18e: 75 02 jne 192 <wrlock+0xe>
190: c9 leave
191: c3 ret
192: 68 3c 03 00 00 push $0x33c
197: e8 7e fe ff ff call 1a <lock>
19c: 58 pop %eax
19d: a1 38 03 00 00 mov 0x338,%eax
1a2: 85 c0 test %eax,%eax
1a4: 74 ea je 190 <wrlock+0xc>
1a6: 6a 01 push $0x1
1a8: 50 push %eax
1a9: 68 38 03 00 00 push $0x338
1ae: e8 fc ff ff ff call 1af <wrlock+0x2b>
1b3: a1 38 03 00 00 mov 0x338,%eax
1b8: 83 c4 0c add $0xc,%esp
1bb: 85 c0 test %eax,%eax
1bd: 75 e7 jne 1a6 <wrlock+0x22>
1bf: eb cf jmp 190 <wrlock+0xc>
Upvotes: 2
Views: 437
Reputation: 340466
I think this illustrates your potential deadlock. Assume a single processor executing your 3 threads in the following sequence:
// to start,
// readers == 0, writer == 0, waiting == 0
Reader1
===================================
// in rdlock()
atomic_inc(&waiting);
for (;;) {
atomic_inc(&readers);
// if (!writer) has not been executed yet
// readers == 1, writer == 0, waiting == 1
Writer
===================================
// in wrlock()
while (atomic_swap(&writer, 1)) spin();
while ((cur=readers)) futex_wait(&readers, cur);
// writer thread is waiting
// readers == 1, writer == 1, waiting == 1
// cur == 1
Reader1
===================================
// back to where it was in rdlock()
if (!writer) return;   // writer == 1, so it does not return
atomic_dec(&readers);
futex_wait(&writer, 1);
// Reader1 is waiting
// readers == 0, writer == 1, waiting == 1
Reader2
===================================
// in rdlock()
atomic_inc(&waiting);
for (;;) {
atomic_inc(&readers);
if (!writer) return;   // writer == 1, so it does not return
atomic_dec(&readers);
futex_wait(&writer, 1);
// Reader2 is waiting
// readers == 0, writer == 1, waiting == 2
Now you're deadlocked: the writer is blocked in futex_wait(&readers, 1) even though readers has since dropped back to 0 (nothing wakes it when a reader backs out of its increment), and both readers are blocked in futex_wait(&writer, 1), which only wrunlock() would wake.
Of course, I might not be understanding how the futex API works (I've never used it), so let me know if I'm off base here. I assume that a futex_wait() that blocks (because the expected value was correct) will not unblock until there's a futex_wake() call for that address. If atomic_xxx() operations can unblock a futex_wait(), this analysis is incorrect.
Finally, if this is what's happening to you, I haven't had a chance to think about possible solutions...
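That said, one direction the trace suggests (just a sketch, untested, and it only addresses this particular interleaving) would be to have a reader that backs out of its optimistic increment wake any writer that may have gone to sleep after seeing that increment:

void rdlock()
{
    atomic_inc(&waiting);
    for (;;) {
        atomic_inc(&readers);
        if (!writer) return;
        atomic_dec(&readers);
        /* A writer may be sleeping in futex_wait(&readers, cur) because it
         * saw our transient increment; since we're backing it out, wake the
         * writer so it can re-check readers. */
        futex_wake(&readers, 1);
        futex_wait(&writer, 1);
    }
}

In the trace above, that extra wake would let the writer see that readers has dropped back to 0 and proceed, and its eventual wrunlock() would then wake the two sleeping readers.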
Upvotes: 5
Reputation: 5507
My guess is that this is a memory ordering issue. I don't know the x86 memory model that well, but I strongly suspect you need a memory fence around your futex_* calls. I understand that x86 guarantees that one core will update the memory contents of other cores in the same order as the memory cells are written, but it seems that you are relying on a stronger assumption: that writes on one core are immediately visible to the others. For example, say that core A holds a rdlock and has just executed rdunlock: it has cleared both waiting and readers, but this information hasn't made it to core B by the time core B attempts wrlock. Core B successfully acquires writer, but sees that there are extant readers. The update to writer has not posted to core A by the point at which rdunlock checks whether it needs to futex_wake(&readers), so it does not. This could exhibit your symptoms, and it would also have the property that it would recover if the futex_* calls were replaced by simple spins. Does this make sense to you?
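Concretely, I mean something like this (a sketch using GCC's __sync_synchronize() as a full barrier; whether such barriers are really needed on x86 is exactly what I'm unsure about):

void rdunlock()
{
    atomic_dec(&waiting);
    atomic_dec(&readers);
    __sync_synchronize();  /* publish the decrements before re-reading writer */
    if (writer) futex_wake(&readers, ALL);
}

void wrlock()
{
    int cur;
    while (atomic_swap(&writer, 1)) spin();
    __sync_synchronize();  /* publish writer==1 before re-reading readers */
    while ((cur=readers)) futex_wait(&readers, cur);
}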
Hmm, should while ((cur=readers)) futex_wait(&readers, cur); be while ((cur==readers))... ?
Upvotes: 0