Reputation: 107
I'm having a problem where I'm having a few condition_variable
's get stuck in their wait
phase even though they've been notified. Each one even has a predicate that's being set just in case they miss the notify call from the main thread.
Here's the code:
unsigned int notifyCount = 10000;
std::atomic<int> threadCompletions = 0;
for (unsigned int i = 0; i < notifyCount; i++)
{
std::atomic<bool>* wakeUp = new std::atomic<bool>(false);
std::condition_variable* condition = new std::condition_variable();
// Worker thread //
std::thread([&, condition, wakeUp]()
{
std::mutex mutex;
std::unique_lock<std::mutex> lock(mutex);
condition->wait(lock, [wakeUp] { return wakeUp->load(); });
threadCompletions++;
}).detach();
// Notify //
*wakeUp = true;
condition->notify_one();
}
Sleep(5000); // Sleep for 5 seconds just in case some threads are taking a while to finish executing
// Check how many threads finished (threadCompletions should be equal to notifyCount)
Unless I'm mistaken, after the for loop is done, threadCompletions
should always be equal to notifyCount
. Very often though, it is not.
When running in release, I'll sometimes get just one or two out of 10000 threads that never finished, but when running in debug, I'll get 20 or more.
I thought maybe the wait
call in the thread is happening after the main thread's notify_one
call (meaning it missed it's notification to wake up), so I passed a predicate into wait
to insure that it doesn't get stuck waiting. But it still does in some cases.
Does anyone know why this is happening?
Upvotes: 3
Views: 2059
Reputation: 264729
You are assuming the call to wait()
is atomic. I don't believe it is. That is why it requires the use of a mutex and a lock.
Consider the following:
Main Thread. Child Thread
// This is your wait unrolled.
while (!wakeUp->load()) {
// This is atomic
// But already checked in the
// thread.
*wakeUp = true;
// Child has not yet called wait
// So this notify_one is wasted.
condition->notify_one();
// The previous call to notify_one
// is not recorded and thus the
// thread is now locked in this wait
// never to be let free.
wait(lock);
}
// Your race condition.
Calls to notify_one()
and wait()
should be controlled via the same mutext to make sure they don't overlap like this.
for (unsigned int i = 0; i < notifyCount; i++)
{
std::atomic<bool>* wakeUp = new std::atomic<bool>(false);
std::mutex* mutex = new std::mutex{};
std::condition_variable* condition = new std::condition_variable();
// Worker thread //
std::thread([&]()
{
std::unique_lock<std::mutex> lock(*mutex);
condition->wait(lock, [&wakeUp] { return wakeUp->load(); });
threadCompletions++;
}).detach();
// Notify //
*wakeUp = true;
std::unique_lock<std::mutex> lock(*mutex);
condition->notify_one();
}
// Don't forget to clean up the new structures correctly/.
Upvotes: 5
Reputation: 4713
You have data racing. Consider following scenario:
Worker Thread: condition variable tests for whether wakeup is true - it isn't
Main Thread: wakeup is set to true and condition variable is getting notified
Worker Thread: condition_variable triggers wait but it happens after notification already occurred - impling that notification misses and the thread might never wake up.
Normally, synchronization of condition variables is done via mutexes - atomics aren't too helpful here. In C++20 there will be special mechanism for waiting/notifying in atomics.
Upvotes: 1