Reputation: 68668
// SubFetch(x,y) = atomically x-=y and return x (__sync_sub_and_fetch)
// AddFetch(x,y) = atomically x+=y and return x (__sync_add_and_fetch)
// CompareWait(x, y) = futex(&x, FUTEX_WAIT, y) wait on x if x == y
// Wake(x, y) = futex(&x, FUTEX_WAKE, y) wake up y waiters
struct Lock
{
Lock() : x(1) {}
void lock()
{
while (true)
{
if (SubFetch(x, 1) == 0)
return;
x = -1;
CompareWait(x, -1);
}
}
void unlock()
{
if (AddFetch(x, 1) == 1)
return;
x = 1;
Wake(x, 1);
}
private:
int x;
};
Linux 3.0 provides a system call called futex, upon which many concurrency utilities are based including recent pthread_mutex implementations. Whenever you write code you should always consider whether using an existing implementation or writing it yourself is the better choice for your project.
Above is an implementation of a Lock (mutex, 1 permit counting semaphore) based upon futex and the semantics description in man futex(7)
It appears to contain a deadlock bug whereby after multiple threads are trying to lock and unlock it a few thousand times, the threads can get into a state where x == -1 and all the threads are stuck in CompareWait, however noone is holding the lock.
Can anyone see where the bug is?
Update: I'm a little surprised that futex(7)/semantics is so broken. I completely rewrote Lock as follows... is this correct now?
// CompareAssign(x,y,z) atomically: if (x == y) {x = z; ret true; } else ret false;
struct Lock
{
Lock() : x(0) {}
void lock()
{
while (!CompareAssign(x, 0, 1))
if (x == 2 || CompareAssign(x, 1, 2))
CompareWait(x, 2);
}
void unlock()
{
if (SubFetch(x, 1) == 0)
return;
x = 0;
Wake(x, 1);
}
private:
int x;
};
The idea here is that x has the following three states:
0: unlocked
1: locked & no waiters
2: locked & waiters
Upvotes: 1
Views: 3366
Reputation: 306
The proper implementation of a futex-based Mutex is described in Ulrich Drepper's paper "Futexes are tricky"
http://people.redhat.com/drepper/futex.pdf
It includes not only the code but also a very detailed explanation of why it is correct. The code from the paper:
class mutex
{
public:
mutex () : val (0) { }
void lock () {
int c;
if ((c = cmpxchg (val, 0, 1)) != 0)
do {
if (c == 2 || cmpxchg (val, 1, 2) != 0)
futex_wait (&val, 2);
} while ((c = cmpxchg (val, 0, 2)) != 0);
}
void unlock () {
//NOTE: atomic_dec returns the value BEFORE the operation, unlike your SubFetch !
if (atomic_dec (val) != 1) {
val = 0;
futex_wake (&val, 1);
}
}
private:
int val;
};
Comparing the code in the paper with your code, I spot a difference
You have
if (x == 2 || CompareAssign(x, 1, 2))
using the futex's value directly whereas Drepper uses the return value from the previous CompareAssign(). That difference will probably affect performance only.
Your unlock code is different, too, but seems to be semantically equivalent.
In any case I would strongly advise you to follow Drepper's code to the letter. That paper has stood the test of time and received a lot of peer review. You gain nothing from rolling your own.
Upvotes: 3
Reputation: 340228
How about this scenario with three threads, A, B , and C.
The initial state of this scenario has:
CompareWait()
x == -1
from when C failed to acquire the lockA B C ============== ================ =============== AddFetch() (so x == 0) SubFetch() (so x == -1) x = 1 x = -1 Wake()
At this point whether B or C are unblocked, they will not get a result of 0
when they SubFetch()
.
Upvotes: 0
Reputation: 68611
The problem is that you explicitly -1 assign to x
if the SubFetch
fails to acquire the lock. This races with the unlock.
x==0
.SubFetch
sets x
to -1, and then thread 2 is suspended.AddFetch
sets x
to 0, so the code then explicitly sets x
to 1 and calls Wake
.x
to -1, and then calls CompareWait
.Thread 2 is now stuck waiting, with x
set to -1, but there is no one around to wake it, as thread 1 has already released the lock.
Upvotes: 4