Reputation: 711
Recently I have seen several SO threads about mixing Parallel.ForEach with async lambdas, but all the proposed answers were workarounds of some kind.
Is there any way I could write:
List<int> list = new List<int>();
Parallel.ForEach(arrayValues, async (item) =>
{
    var x = await LongRunningIoOperationAsync(item);
    list.Add(x);
});
How can I ensure that list will contain all items from all iterations executed within the lambdas in each iteration?
How does Parallel.ForEach generally work with async lambdas? If it hits an await, will it hand its thread over to the next iteration?
I assume the ParallelLoopResult.IsCompleted property is not the right one to check, as it will return true once all iterations have executed, whether or not their actual lambda jobs have finished?
Upvotes: 3
Views: 1329
Reputation: 457402
Recently I have seen several SO threads about mixing Parallel.ForEach with async lambdas, but all the proposed answers were workarounds of some kind.
Well, that's because Parallel doesn't work with async. And from a different perspective, why would you want to mix them in the first place? They do opposite things. Parallel is all about adding threads and async is all about giving up threads. If you want to do asynchronous work concurrently, then use Task.WhenAll. That's the correct tool for the job; Parallel is not.
That said, it sounds like you want to use the wrong tool, so here's how you do it...
How can I ensure that list will contain all items from all iterations executed within the lambdas in each iteration?
You'll need some kind of signal that some code can block on until the processing is done, e.g., CountdownEvent or Monitor. On a side note, you'll need to protect access to the non-thread-safe List<T> as well.
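A minimal sketch of that workaround, assuming a CountdownEvent as the signal and a plain lock around the list. The class name, the gate object and the body of LongRunningIoOperationAsync are placeholders invented for illustration, not code from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class CountdownSketch
{
    // Hypothetical stand-in for the question's LongRunningIoOperationAsync.
    private static async Task<int> LongRunningIoOperationAsync(int item)
    {
        await Task.Delay(50); // simulated IO latency
        return item * 2;
    }

    public static List<int> Run(int[] arrayValues)
    {
        var list = new List<int>();
        var gate = new object(); // protects the non-thread-safe List<int>

        // One signal is expected per item.
        using (var countdown = new CountdownEvent(arrayValues.Length))
        {
            // The async lambda binds to Action<int>, so it is effectively async void:
            // Parallel.ForEach returns as soon as every lambda has reached its first await.
            Parallel.ForEach(arrayValues, async item =>
            {
                try
                {
                    var x = await LongRunningIoOperationAsync(item);
                    lock (gate) { list.Add(x); }
                }
                finally
                {
                    countdown.Signal(); // signal even if the operation throws
                }
            });

            // Block until every lambda has actually finished.
            countdown.Wait();
        }

        return list;
    }
}
```

Note that the items arrive in completion order, not input order, and that an exception escaping an async void lambda is not observed by Parallel.ForEach at all, which is one more reason this remains the wrong tool.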
How does Parallel.ForEach generally work with async lambdas? If it hits an await, will it hand its thread over to the next iteration?
Since Parallel doesn't understand async lambdas, when the first await yields (returns) to its caller, Parallel will assume that iteration of the loop is complete.
I assume the ParallelLoopResult.IsCompleted property is not the right one to check, as it will return true once all iterations have executed, whether or not their actual lambda jobs have finished?
Correct. As far as Parallel knows, it can only "see" the method up to the first await that returns to its caller. So it doesn't know when the async lambda is complete. It will also assume iterations are complete too early, which throws partitioning off.
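To make the early return concrete, here is a small sketch in which every lambda awaits a task that is only completed after the loop has returned. The class name and the TaskCompletionSource "release" gate are illustrative assumptions, used only to make the behaviour deterministic.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class EarlyReturnSketch
{
    public static (bool loopCompleted, int finishedAtReturn, int finishedEventually) Run(int itemCount)
    {
        // Gate that stays incomplete until after Parallel.ForEach has returned.
        var release = new TaskCompletionSource<bool>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        int finished = 0;

        using (var allDone = new CountdownEvent(itemCount))
        {
            ParallelLoopResult result = Parallel.ForEach(
                Enumerable.Range(0, itemCount),
                async _ =>
                {
                    // The task is incomplete, so the lambda returns to Parallel here.
                    await release.Task;
                    Interlocked.Increment(ref finished);
                    allDone.Signal();
                });

            // The loop reports completion although no lambda body has finished.
            int finishedAtReturn = Volatile.Read(ref finished);

            release.SetResult(true); // let the lambdas continue
            allDone.Wait();          // now wait for the real end of the work

            return (result.IsCompleted, finishedAtReturn, Volatile.Read(ref finished));
        }
    }
}
```

Calling Run(5) should yield loopCompleted true with finishedAtReturn 0, and finishedEventually only reaches 5 after the explicit wait.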
Upvotes: 7
Reputation: 81583
You don't need Parallel.For/ForEach here; you just need to await a list of tasks.
Background
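In short, you need to be very careful with async lambdas, and in particular with whether you are passing them to an Action or a Func<Task>. An async lambda passed as an Action becomes an async void method, so the caller gets no Task to await.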
Your problem arises because Parallel.For / ForEach is not suited to the async and await pattern or to IO-bound tasks. They are suited to CPU-bound workloads, which means they essentially take Action parameters and let the task scheduler create the tasks for you.
If you want to run multiple async tasks at the same time, use Task.WhenAll, or a TPL Dataflow block (or something similar), which can deal effectively with both CPU-bound and IO-bound workloads. Said more directly, they can deal with tasks, which is what an async method returns.
Unless you need to do more inside your lambda (which you haven't shown), just use a Select and WhenAll:
var tasks = items.Select(LongRunningIoOperationAsync);
var results = await Task.WhenAll(tasks); // here is your array of int results
If you do, you can still use await:
var tasks = items.Select(async (item) =>
{
    var x = await LongRunningIoOperationAsync(item);
    // do other stuff
    return x;
});
var results = await Task.WhenAll(tasks);
Note: If you need the extended functionality of Parallel.ForEach (namely the options to control max concurrency), there are several approaches; however, Rx or Dataflow might be the most succinct.
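For comparison, one of those other approaches needs no extra library: a SemaphoreSlim used as a throttle around the same Select/WhenAll pattern. This is a swapped-in technique rather than the Rx or Dataflow route mentioned above, and the class name, the maxConcurrency parameter and the body of LongRunningIoOperationAsync are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleSketch
{
    // Hypothetical stand-in for the question's LongRunningIoOperationAsync.
    private static async Task<int> LongRunningIoOperationAsync(int item)
    {
        await Task.Delay(20); // simulated IO latency
        return item * 2;
    }

    // Assumes maxConcurrency >= 1; a value of 0 would never let any item through.
    public static async Task<int[]> RunAsync(IEnumerable<int> items, int maxConcurrency)
    {
        using (var throttle = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = items.Select(async item =>
            {
                // Wait asynchronously for a free slot; no thread is blocked here.
                await throttle.WaitAsync();
                try
                {
                    return await LongRunningIoOperationAsync(item);
                }
                finally
                {
                    throttle.Release(); // free the slot even on failure
                }
            }).ToList(); // materialise so each task is started exactly once

            // Results come back in the same order as the input items.
            return await Task.WhenAll(tasks);
        }
    }
}
```

A call such as await ThrottleSketch.RunAsync(items, 4) would then run at most four operations at a time while still awaiting all of them.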
Upvotes: 5