Matthias247

Reputation: 10416

Capturing a thread_local variable by reference in lambda does not work as expected

I'm currently building a system where I have multiple threads running, and one thread can queue work to another thread and wait for completion. I'm using mutexes and condition_variables for synchronization. To avoid creating a new mutex and cv for each operation, I wanted to optimize it and tried to use a thread_local mutex/cv pair for each thread that is waiting. However, this unexpectedly did not work, and I would be interested to know why.

Basically my code which queues work into the other thread and waits for it looks like:

/* thread_local */ std::mutex mtx;
/* thread_local */ std::condition_variable cv;
bool done = false;  

io_service.post([&]() {
    // Execute the handler in context of the io thread
    functionWhichNeedsToBeCalledInOtherThread();

    // Signal completion to unblock the waiter
    {
        std::lock_guard<std::mutex> lock(mtx);
        done = true;
    }
    cv.notify_one();
});

// Wait until queued work has been executed in io thread
{
    std::unique_lock<std::mutex> lk(mtx);
    while (!done) cv.wait(lk);
}

This works fine if the synchronization objects are not thread_local. When I add thread_local, the waiting thread waits forever, which indicates that the condition variable it waits on is never signaled. I now have the feeling that, despite capturing the objects by reference, the thread_local objects of the other thread are used inside the lambda. I can even confirm that the capture is not doing what I expect by checking the address of mtx inside and outside of the lambda: they don't match.

The question is:

I can work around the error by creating an explicit reference to the thread_local variables outside of the lambda and using those references inside it. However I think the behavior is unexpected and would love to hear an explanation whether this is correct behavior or not.

Upvotes: 5

Views: 2452

Answers (2)

Galik

Reputation: 48635

For a mutex to work every thread needing synchronization must lock the same mutex. What thread_local does is create a different mutex for each thread. If your threads each have their own, independent mutex they can't possibly communicate through them. You need one mutex for all your threads to share.

The same is true of condition variables. All threads need to be 'talking' to the same condition variable. That means it does not make sense to have a separate condition variable for each thread.

Regarding your lambda, each thread that executes the lambda will use its own copy of the thread_local variables. Given that the mutex and condition variable you access from the lambda are otherwise accessed from a different thread, there is no synchronization, as your lambda is working with a completely different set of variables.

Upvotes: 3

yuri kilochek

Reputation: 13484

What you observe is the correct behavior, as you are not actually capturing anything. Static and thread storage duration objects are accessible directly, so in the interest of efficiency [&]-capture has no effect on those. You can however capture appropriate thread local instances explicitly:

io_service.post([&mtx = mtx, &cv = cv, &done]() {

Upvotes: 3
