ttemple
ttemple

Reputation: 882

Should condition_variable.notify_all be covered by mutex lock?

I have implemented a class that allows me to synchronize threads with a condition variable. I have found conflicting information as to whether the notify_all should be done within the lock, or outside of the lock. I have found examples constructed both ways.

The argument for releasing the lock first is to prevent the waiting thread from blocking on the mutex right after being released by the notify.

The argument against releasing the lock first is an assertion that notifications could be missed by waiting thread(s).

The two versions of the release function are here:

// version 1 - unlock then notify.
void release(int address = 1)
{
    {
        std::lock_guard<std::mutex> lk(_address_mutex);
        _address = address;
    }
    _cv.notify_all();
}

// version 2 - notify then unlock
void release(int address = 1)
{
    std::lock_guard<std::mutex> lk(_address_mutex);
    _address = address;
    _cv.notify_all();
}

For reference, the wait code looks like this:

bool wait(const std::chrono::microseconds dur, int address = 1)
{
    std::unique_lock<std::mutex> lk(_address_mutex);
    if (_cv.wait_for(lk, dur, [&] {return _address == address; }))
    {
        _address = 0;
        return true;
    }
    return false;
}

Is there a risk of waiting threads missing notifications in version 1, where the mutex is allowed to go out of scope before the notify_all? If so, how does it happen? (It is not obvious to me how this causes missed notifications.)

I can see clearly how keeping the mutex locked during the notify causes the waiting threads to immediately go into a wait. But this is a small price to pay if it prevents missed notifies.

Upvotes: 5

Views: 2173

Answers (2)

Arno Schoedl
Arno Schoedl

Reputation: 169

There is a scenario where it is crucial that the lock is held while notify_all is called: when the condition_variable is destroyed after waiting, a spurious wake-up can cause the wait to be ended and the condition_variable destroyed before notify_all is called on the already destroyed object.

Upvotes: 2

Yakk - Adam Nevraumont
Yakk - Adam Nevraumont

Reputation: 275385

There is no risk releasing the lock if the mutex was held in some interval between the state of the condition test changing and thr notification.

{
    std::lock_guard<std::mutex> lk(_address_mutex);
    _address = address;
}
_cv.notify_all();

here, the mutex was unlocked after _address was changed. So no risk.

If we modify _address to be atomic, naively this looks correct:

{
    std::lock_guard<std::mutex> lk(_address_mutex);
}
_address = address;
_cv.notify_all();

but it is not; here, the mutex is released for the entire period between modifying the condition test and notification,

_address = address;
{
    std::lock_guard<std::mutex> lk(_address_mutex);
}
_cv.notify_all();

The above, however, becomes correct again (if more than a little strange).


The risk is that the condition test will be evaluated with the mutex active (as false), then changed, then notification sent, then the waiting thread waits on a notification and releases the mutex.

waiting|signalling
lock
test
        test changed
        notification
listen+unlock

the above is an example of a missed notification.

So long as we have the mutex held anywhere after the test change and before the notification it cannot happen.

Upvotes: 9

Related Questions