Reputation: 5699
The code below should print the same integer twice, but it does not. An analogous program in JavaScript would print the same number twice.
This seems reasonable in C++: by the time stdfun is executed, regfun has already finished, so local_var no longer exists.
So my question is: how can we correctly access a captured local variable beyond the lifetime of its enclosing scope, as JavaScript does by default?
#include <functional>
#include <future>
#include <cmath>
#include <iostream>
#include <ctime>
#include <windows.h>
using namespace std;
int start_stdfun_in_a_new_thread(std::function<void()> stdfun)
{
    int id = rand();
    std::function<void()> call = [=]()
    {
        Sleep(1000);//let regfun finish
        stdfun();
    };
    std::async(std::launch::async,call);
    return id;
}
void regfun()
{
    int local_var = -1;
    std::function<void()> stdfun = [=,&local_var]() mutable -> void
    {
        cout<<local_var<<endl;
    };
    local_var = start_stdfun_in_a_new_thread(stdfun);
    cout<<local_var<<endl;
}
int main()
{
    regfun();
    Sleep(1000000);
}
It is hard to describe exactly what my question is, but I just need something in C++ like what we do in JavaScript. If you are very familiar with JavaScript, you may understand what I mean.
Upvotes: 2
Views: 1596
Reputation: 275878
Here is a C++ version of your code that no longer invokes undefined behavior.
As a happy side effect, it also exhibits the behavior you want:
#include <cstdlib>
#include <functional>
#include <future>
#include <iostream>
#include <memory>
#include <utility>
#include <windows.h>

int start_stdfun_in_a_new_thread(std::function<void()> stdfun)
{
    int id = std::rand();
    // no need for this to be a type erased `std::function`:
    auto call = [=]()
    {
        Sleep(1000);//let regfun finish
        stdfun();
    };
    // caveat: the future returned here is discarded, and for
    // std::launch::async its destructor waits for `call` to finish, so a
    // conforming implementation blocks on this line for about a second.
    std::async(std::launch::async,call);
    return id;
}
void regfun()
{
    // here we create a shared pointer to an int and store it locally:
    auto local_var = std::make_shared<int>(-1);
    // we take the shared pointer and copy it by value into our lambda:
    // note that we only type erase (turn it into a std::function) when we need to,
    // no earlier.
    auto stdfun = [local_var]() mutable -> void
    {
        std::cout<<*local_var<<std::endl;
    };
    // the move is just an optimization. It works with or without it.
    *local_var = start_stdfun_in_a_new_thread(std::move(stdfun));
    std::cout<<*local_var<<std::endl;
}
int main()
{
    regfun();
    Sleep(1000000);
}
Lifetime of data in C++ is relatively simple, barring a few corner cases [1]. Variables create their data in automatic storage, and when the variable goes out of scope the data is destroyed as well. Unnamed data (temporaries) exist for the length of the current statement (until the ;), but under certain circumstances can have their lifetime extended to that of a nearby reference variable. This, however, is not transitive: under no circumstance do unnamed objects ever outlast the {}-enclosed block they are created in. [2]
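To make the temporary rules concrete, here is a minimal sketch. Tracer, make_tracer and the two functions are hypothetical names invented for this illustration; Tracer just counts how many of its instances currently exist.

```cpp
// Hypothetical helper type: counts how many instances are currently alive.
struct Tracer
{
    static int alive;
    Tracer() { ++alive; }
    Tracer(const Tracer&) { ++alive; }
    ~Tracer() { --alive; }
};
int Tracer::alive = 0;

Tracer make_tracer() { return Tracer{}; }

// The temporary is bound directly to a const reference, so it lives as long
// as `ref` does, i.e. until the end of this function body.
int alive_while_bound()
{
    const Tracer& ref = make_tracer();
    (void)ref;                // silence unused-variable warnings
    return Tracer::alive;     // the temporary still exists here
}

// No reference is bound, so the temporary dies at the end of the statement.
int alive_after_statement()
{
    make_tracer();
    return Tracer::alive;     // the temporary is already gone
}
```

alive_while_bound() returns 1 and alive_after_statement() returns 0. In both cases the count is back to 0 once the function has returned, which is the "never outlasts the enclosing block" part.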
You can also create data on the free store via new (or via other methods, like malloc). Such data has a more complicated lifetime, but it generally lives until some code somewhere explicitly says "I am done with it".
std::shared_ptr is a class used to give your data a more complicated lifetime. You can store data created via new in a std::shared_ptr, but the more conventional way is to use std::make_shared<TYPE>(construction arguments) to create it (this is, under most circumstances, both safer and more efficient than the other method).
Each std::shared_ptr variable has a conventional lifetime; however, the data it points to has a lifetime bounded by the last copy of the shared_ptr that points to it. (I say copy, because this isn't magic: if you have two unrelated shared_ptrs to one piece of data, they will (usually) be clueless about their common data, and both think they "own" it.)
So you can create a std::shared_ptr<int> and pass it around by value, and the data it points to will live as long as any of the pointers do. When the last pointer goes away, the data is cleaned up.
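A small sketch of that rule, assuming nothing beyond the standard library. The function name is made up for the example, and a std::weak_ptr is used only as a non-owning observer so we can ask whether the int still exists.

```cpp
#include <memory>

// Returns true if the int survived its first owner and was destroyed only
// after the last owner let go.
bool data_lives_until_last_owner()
{
    std::weak_ptr<int> observer;      // watches the data without owning it
    std::shared_ptr<int> second;
    {
        auto first = std::make_shared<int>(7);
        observer = first;
        second = first;               // a copy: two owners now
    }                                 // `first` is destroyed here
    const bool alive_after_first = !observer.expired();
    second.reset();                   // the last owner lets go
    const bool gone_after_last = observer.expired();
    return alive_after_first && gone_after_last;
}
```

The function returns true: the int outlives the block that created it because second still owns it, and it is cleaned up as soon as second is reset.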
Which is what I did above. The lifetime of what local_var points to is automatically extended to cover both the body of regfun and the lifetime of stdfun and copies thereof (including the copy that will live in a std::function).
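Stripped of the threading, the same idea can be sketched like this. make_reader is a hypothetical stand-in for regfun that hands the closure back to its caller instead of starting a thread.

```cpp
#include <functional>
#include <memory>

// Hypothetical stand-in for regfun: builds a closure over a shared int,
// changes the int afterwards, and returns the closure to the caller.
std::function<int()> make_reader()
{
    auto value = std::make_shared<int>(-1);
    // The closure holds its own copy of the pointer, not a reference to the local.
    std::function<int()> reader = [value] { return *value; };
    *value = 42;      // modified after capture; the closure sees the same int
    return reader;    // the local `value` handle dies here, the int does not
}
```

Calling make_reader()() yields 42, even though make_reader has already returned by the time the closure runs. That is roughly the behavior a JavaScript closure gives you by default.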
The final technique I used liberally is auto. auto declares a variable whose type is deduced from how the variable is initialized. For lambdas, this lets you store them directly (instead of as a type-erased std::function), as the name of a lambda's type cannot be uttered.
In other cases, the right hand side already details the type involved (std::make_shared<int> makes the type clear), and repeating it on the left hand side fails to add much clarity, and violates the DRY (don't repeat yourself) principle.
I use std::function sparingly, as std::function is mainly about type erasure. Type erasure is the act of taking a type with all its details, and wrapping it up in a custom-crafted box that erases all of those details and leaves a uniform run time interface. While this is cool and all, it comes with a run time (and sometimes compile time) cost. You should engage in type erasure (like std::function) sparingly: in interfaces where the implementation is hidden, when you want to treat multiple different types (such as function pointers and multiple different lambdas) in a uniform way and store them, or when you want to use lambdas and would be annoyed by your inability to utter their name.
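As a rough illustration of the "treat multiple different types in a uniform way" case, here is a sketch; twice, apply_all and type_erasure_demo are names invented for the example. A single lambda is stored with auto, and std::function only appears where a function pointer and two different closure types have to share one container.

```cpp
#include <functional>
#include <vector>

int twice(int x) { return 2 * x; }

// Applies every step in order; each element may wrap a different callable type.
int apply_all(const std::vector<std::function<int(int)>>& steps, int x)
{
    for (const auto& step : steps)
        x = step(x);
    return x;
}

int type_erasure_demo()
{
    int offset = 3;
    // A single lambda can keep its own (unnameable) type via auto:
    auto add_offset = [offset](int x) { return x + offset; };

    // Type erasure is needed only once unrelated types share a container:
    std::vector<std::function<int(int)>> steps;
    steps.push_back(twice);                          // plain function
    steps.push_back(add_offset);                     // lambda with a capture
    steps.push_back([](int x) { return x * x; });    // another lambda type
    return apply_all(steps, 5);                      // ((5 * 2) + 3) squared
}
```

type_erasure_demo() returns 169. Each call through a std::function goes through an extra indirection, which is the run time cost mentioned above.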
[1] Lifetime of data in C++ is simple, barring a few corner cases. In those corner cases, lifetime can get extremely complex. The simplest of these corner cases is temporary lifetime extension when bound directly to a reference. Other fun corner cases include the concept of "safely derived pointers" and "strict aliasing", which come into play when you mess around with the types and bits of pointers. Other complex corner cases include static local variables, static and non-static global variables, thread-local variables, and copy elision [2.1]. The most basic advice I can give you would be to simply avoid most of these corner cases, and if you find you really really need to use them, spend a whole bunch of time reading up on the implied lifetime and the various common traps and errors that occur. The lifetime issues are sufficiently complex that "I tried it and it worked" is not strong evidence that your code is correct -- undefined, implementation-defined, or extremely fragile behavior is ridiculously easy to trigger in all of these cases.
[2] This sentence is a lie. [2.1] Copy elision can cause an unnamed local variable to have a lifetime far longer than the enclosing block, but only because it conceptually became other named or unnamed variables, and the copy constructors and destructors were all elided (eliminated). However, it is a useful lie, as it helps head off some really common "reference lifetime extension" misconceptions.
Upvotes: 3
Reputation: 45725
Your local_var is bound to the local context, i.e. it's dead when regfun exits, which happens almost immediately. But your lambda captures it by reference, which means it is a dangling reference when stdfun executes later, since by then local_var is already dead.
So that's undefined behavior. What you would need (and what JavaScript does) is for the lifetime of a captured variable to be extended. But that's not the case with C++11 lambdas, as explained in http://en.cppreference.com/w/cpp/language/lambda:
Dangling references
If an entity is captured by reference, implicitly or explicitly, and the function call operator of the closure object is invoked after the entity's lifetime has ended, undefined behavior occurs. The C++ closures do not extend the lifetimes of the captured references.
I see two solutions:
One solution would be heap allocation of the object you want to capture, pointing to it with a std::shared_ptr, and capturing this pointer by value (which will copy it into the lambda instance). The last shared pointer instance will then delete the heap-allocated object for you.
If possible, you can also define it in some outer context that is guaranteed to outlive every use of it. (In your simple code that would be main, but in most cases you want to wait for the threads; the context that both starts and waits for the threads is most probably the correct context for this.) Then pass it to regfun by reference, and also capture it by reference. Even after regfun exits, the reference is then still valid (as long as it lives in an outliving context).
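A minimal sketch of the second solution, under the assumption that the starting function hands back the std::future so the owning context can wait on it. start_reader and outer_context_demo are hypothetical names, not part of your code.

```cpp
#include <future>

// Hypothetical stand-in for regfun: starts a task that reads the caller's
// variable by reference and returns the future so the caller can wait on it.
std::future<void> start_reader(const int& value, int& observed)
{
    return std::async(std::launch::async,
                      [&value, &observed] { observed = value; });
}

// The outer context owns both variables, starts the task, and waits for it
// before either variable goes out of scope.
int outer_context_demo()
{
    int value = 42;
    int observed = 0;
    std::future<void> done = start_reader(value, observed);
    done.get();       // wait here; the references stay valid until this returns
    return observed;
}
```

outer_context_demo() returns 42. Note that the sketch only reads value from the task; if the owner also wanted to modify it while the task is running, it would need some synchronization (a mutex or std::atomic, for example). On toolchains where threads need an explicit flag, this has to be built with thread support enabled (for example -pthread with GCC or Clang).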
Upvotes: 6