Capturing a thread_local variable by reference in lambda does not work as expected

Question

I'm currently building a system with multiple running threads, in which one thread can queue work to another thread and wait for its completion. I'm using mutexes and condition variables for synchronization. To avoid creating a new mutex and condition variable for each operation, I tried to use a thread_local mutex/cv pair for each waiting thread. However, this unexpectedly did not work, and I would be interested to know why.

Basically my code which queues work into the other thread and waits for it looks like:

/* thread_local */ std::mutex mtx;
/* thread_local */ std::condition_variable cv;
bool done = false;  

io_service.post([&]() {
    // Execute the handler in context of the io thread
    functionWhichNeedsToBeCalledInOtherThread();

    // Signal completion to unblock the waiter
    {
        std::lock_guard<std::mutex> lock(mtx);
        done = true;
    }
    cv.notify_one();
});

// Wait until queued work has been executed in io thread
{
    std::unique_lock<std::mutex> lk(mtx);
    while (!done) cv.wait(lk);
}

This works fine if the synchronization objects are not thread_local. When I add thread_local, the waiting thread waits forever, which indicates that the condition variable is never signaled. I suspect that, despite capturing the objects by reference, the lambda uses the thread_local objects of the other thread. I can confirm that the capture is not doing what I expected by checking the address of mtx inside and outside of the lambda: the two addresses don't match.

My questions are:

Is this a bug in the compiler or by design? I'm using Visual Studio 2015 and haven't checked with other compilers yet.
Is capturing thread_local variables by reference even permitted?

I can work around the error by creating an explicit reference to the thread_local variables outside of the lambda and using those references inside it. However I think the behavior is unexpected and would love to hear an explanation whether this is correct behavior or not.

yuri kilochek · Accepted Answer

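What you observe is the correct behavior, as you are not actually capturing anything. Only variables with automatic storage duration can be captured. Objects with static or thread storage duration are accessible directly by name, so a [&] capture has no effect on them, and the name mtx inside the lambda refers to the instance belonging to whichever thread executes the body. You can, however, capture the appropriate thread-local instances explicitly with init-captures: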

// done is an ordinary local, so it still needs a regular by-reference capture
io_service.post([&mtx = mtx, &cv = cv, &done]() {
