Reputation: 513
I have a function that runs asynchronously. Unfortunately, it only spits out the correct answer on occasion. The values represented by futures[i].get()
change each time I run the code. I'm new to multithreading.
double async_func() const {
vector<future<double>> futures;
double val = 0;
for (int i = 0; i < rows; i++) {
futures.push_back(std::async(std::launch::async, [&] {return minor(i,0).determinant();}) );
}
for (int i = 0; i < rows; i++)
val += futures[i].get();
return val;
}
Upvotes: 2
Views: 157
Reputation: 10355
The problem lies in the capture by reference.
for (int i = 0; i < rows; i++)
{
futures.push_back(std::async(std::launch::async, [&] // capture by reference!
{
return minor(i,0).determinant();
}));
}
This is a problem, because each executed method doesn't get its own value of i
, but a reference to the loop variable i
. Hence, as soon as i
is evaluated for the function call minor(i, 0)
, it may have already changed. Even worse, the function call may not have been executed yet before the end of the loop. Hence i
may have been destroyed before it's been used.
If the above description isn't enough, maybe this timeline can help. Note that this is not a real situation, it's just an illustration.
Assume that iterating i
takes 2 time steps, evaluating the parameters of the function takes 2 steps and starting a thread takes 5 steps (on a real machine probably way more!). The job done by each thread may take several thousand steps (that's why we parallelize them, right?).
Time -->
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 ...
^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^
| | | | | | | | | | | | | | Thread 3 function call.
| | | | | | | | | | | | | Thread 3 evaluate parameters (i == garbage!).
| | | | | | | | | | | | Thread 3 start; Thread 2 function call.
| | | | | | | | | | | Thread 2 evaluate parameters (i == garbage!).
| | | | | | | | | | Destroy i; Thread 2 start; Thread 1 function call.
| | | | | | | | | Loop condition broken; Thread 1 evaluate parameters (i == 4 !).
| | | | | | | | Thread 0 function call; Thread 1 start; i = 4;
| | | | | | | Thread 3 request to start; Thread 0 evaluate parameters (i == 3 !).
| | | | | | Thread 0 start; i = 3;
| | | | | Thread 2 request to start.
| | | | i = 2.
| | | Thread 1 request to start.
| | i = 1.
| Thread 0 request to start.
Create i, i = 0.
So some threads get meaningful data, others won't. Depending on actual timings, it might be possible to increment i
at the same time as evaluating its value for the function parameter in which case you have a race condition.
The resolution is by capturing i
by value:
for (int i = 0; i < rows; i++)
{
futures.push_back(std::async(std::launch::async, [=] // capture by value
{
return minor(i,0).determinant();
}));
}
Recommendation: when dealing with asynchronous bits, be explicit about what your capture:
for (int i = 0; i < rows; i++)
{
futures.push_back(std::async(std::launch::async, [i] // explicit capture by value
{
return minor(i,0).determinant();
}));
}
You also should avoid having reallocations to your vector of futures. Simply add futures.reserve(rows)
before the loop.
Upvotes: 6