Reputation: 149
Here's a fine-grained locking queue introduced by Anthony Williams in section 6.2.3 of C++ Concurrency in Action.
/*
pop only needs to lock head_mutex, plus tail_mutex for a short section;
push only needs tail_mutex. This maximizes container concurrency.
*/
template<typename T> class threadsafe_queue
{
private:
struct node
{
std::shared_ptr<T> data;
std::unique_ptr<node> next;
};
std::mutex head_mutex; // locked when modifying head
std::unique_ptr<node> head;
std::mutex tail_mutex; // locked when modifying tail
node* tail;
std::condition_variable data_cond;
node* get_tail()
{
std::lock_guard<std::mutex> tail_lock(tail_mutex);
return tail;
}
public:
/*
create a dummy node
*/
threadsafe_queue():
head(new node),tail(head.get())
{}
std::shared_ptr<T> wait_and_pop()
{
std::unique_lock<std::mutex> head_lock(head_mutex);
data_cond.wait(head_lock,[&]{return head.get()!=get_tail();}); //#1
std::unique_ptr<node> old_head=std::move(head);
head=std::move(old_head->next);
return old_head->data;
}
void push(T new_value)
{
std::shared_ptr<T> new_data(
std::make_shared<T>(std::move(new_value)));
std::unique_ptr<node> p(new node);
{
std::lock_guard<std::mutex> tail_lock(tail_mutex);
tail->data=new_data;
node* const new_tail=p.get();
tail->next=std::move(p);
tail=new_tail;
}
data_cond.notify_one();
}
};
Here's the situation: there are two threads (thread1 and thread2). thread1 is doing a wait_and_pop and thread2 is doing a push. The queue is empty.

thread1 is at the line marked #1: it has already evaluated head.get()!=get_tail() but has not yet blocked inside data_cond.wait(). At that moment its time slice runs out and thread2 begins.

thread2 finishes the push function and calls data_cond.notify_one(). Then thread1 resumes.

Now thread1 blocks in data_cond.wait(), but it waits forever.

Could this situation happen? If so, how can this container be fixed?
Upvotes: 4
Views: 2205
Reputation: 42554
Yes, the situation described in the OP is possible and will result in notifications being lost. Injecting a nice big time delay in the predicate function makes it easy to trigger. Here's a demonstration at Coliru. Notice how the program takes 10 seconds to complete (the length of the timeout to wait_for) instead of 100 milliseconds (the time at which the producer inserts an item into the queue). The notification is lost.
There is an assumption implicit in the design of condition variables that the state of the condition (the return value of the predicate) cannot change while the associated mutex is locked. This is not true for this queue implementation, since push can change the "emptiness" of the queue without holding head_mutex.
§30.5p3 specifies that wait
has three atomic parts:
- the release of the mutex, and entry into the waiting state;
- the unblocking of the wait; and
- the reacquisition of the lock.
Note that none of these mention checking of the predicate, if any was passed to wait. The behavior of wait with a predicate is described in §30.5.1p15:
Effects:
while (!pred()) wait(lock);
Note that there is no guarantee here either that the predicate check and the wait are performed atomically. There is a pre-condition that lock is locked and its associated mutex held by the calling thread.
As far as fixing the container to avoid loss of notifications, I would change it to a single-mutex implementation and be done with it. It's a bit of a stretch to call it fine-grained locking when push and pop both end up taking the same mutex (tail_mutex) anyway.
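A minimal sketch of that single-mutex approach (the class and member names here are illustrative, not the book's listing). Because every check of the predicate holds the same mutex that push locks while changing the state, a notification can no longer slip into the gap between the predicate check and blocking:

```cpp
#include <condition_variable>
#include <memory>
#include <mutex>
#include <queue>

template <typename T>
class threadsafe_queue_single
{
    std::mutex mut;                            // one mutex guards ALL queue state
    std::queue<std::shared_ptr<T>> data_queue;
    std::condition_variable data_cond;
public:
    void push(T new_value)
    {
        auto data = std::make_shared<T>(std::move(new_value));
        {
            // The state change happens under the same mutex the waiter holds
            // while checking the predicate, so the wakeup cannot be lost.
            std::lock_guard<std::mutex> lock(mut);
            data_queue.push(data);
        }
        data_cond.notify_one();
    }
    std::shared_ptr<T> wait_and_pop()
    {
        std::unique_lock<std::mutex> lock(mut);
        data_cond.wait(lock, [&] { return !data_queue.empty(); });
        auto res = data_queue.front();
        data_queue.pop();
        return res;
    }
};
```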
Upvotes: 7
Reputation: 63471
data_cond.wait() checks the condition every time it is woken up. So even though it may have already been checked, it will be checked again after data_cond.notify_one(). At that point, there is data to be popped (because Thread 2 had just completed a push), and so it returns. Read more here.
The only thing you should be worrying about is calling wait_and_pop on an empty queue and then never pushing any more data onto it. This code does not have a mechanism for timing out a wait and returning an error (or throwing an exception).
Upvotes: 0