Reputation: 43485
I wrote a PLINQ query that ends with the ForAll operator, and I used the WithCancellation operator in order to cancel the query midway. Surprisingly, the query is not canceled. Here is a minimal demonstration of this behavior:
CancellationTokenSource cts = new CancellationTokenSource(1000);
cts.Token.Register(() => Console.WriteLine("--Token Canceled"));
try
{
    Enumerable.Range(1, 20)
        .AsParallel()
        .WithDegreeOfParallelism(2)
        .WithCancellation(cts.Token)
        .ForAll(x =>
        {
            Console.WriteLine($"Processing item #{x}");
            Thread.Sleep(200);
            //cts.Token.ThrowIfCancellationRequested();
        });
    Console.WriteLine($"The query was completed successfully");
}
catch (OperationCanceledException)
{
    Console.WriteLine($"The query was canceled");
}
Output (undesirable):
Processing item #1
Processing item #2
Processing item #4
Processing item #3
Processing item #5
Processing item #6
Processing item #8
Processing item #7
Processing item #10
Processing item #9
--Token Canceled
Processing item #11
Processing item #12
Processing item #13
Processing item #14
Processing item #15
Processing item #16
Processing item #17
Processing item #19
Processing item #20
Processing item #18
The query was canceled
The query completes with an OperationCanceledException, but not before processing all 20 items. The desirable behavior emerges when I uncomment the cts.Token.ThrowIfCancellationRequested(); line.
Output (desirable):
Processing item #2
Processing item #1
Processing item #3
Processing item #4
Processing item #5
Processing item #6
Processing item #7
Processing item #8
Processing item #9
Processing item #10
--Token Canceled
The query was canceled
Am I doing something wrong, or is this the by-design behavior of the ForAll + WithCancellation combination? Or is it a bug in the PLINQ library?
Upvotes: 2
Views: 311
Reputation: 56
You can use Select instead of ForAll:
CancellationTokenSource cts = new CancellationTokenSource(1000);
cts.Token.Register(() => Console.WriteLine("--Token Canceled"));
try
{
    Enumerable.Range(1, 20)
        .AsParallel()
        .WithDegreeOfParallelism(2)
        .WithCancellation(cts.Token)
        .Select(x =>
        {
            Console.WriteLine($"Processing item #{x}");
            Thread.Sleep(200);
            //cts.Token.ThrowIfCancellationRequested();
            return true;
        }).ToArray();
    Console.WriteLine($"The query was completed successfully");
}
catch (OperationCanceledException)
{
    Console.WriteLine($"The query was canceled");
}
Upvotes: 0
Reputation: 43485
Evk's answer thoroughly explains the observed behavior: the PLINQ operators check the cancellation token periodically, and not for each processed item. I searched for a way to alter this behavior, and I think that I found one. When the parallel query is enumerated with a foreach loop, the cancellation token is checked on each iteration. So here is the solution that I came up with:
/// <summary>
/// Invokes in parallel the specified action for each element in the source,
/// checking the associated CancellationToken before invoking the action.
/// </summary>
public static void ForAll2<TSource>(this ParallelQuery<TSource> source,
    Action<TSource> action)
{
    foreach (var _ in source.Select(item => { action(item); return 0; })) { }
}
The Select operator projects the ParallelQuery<TSource> to a ParallelQuery<int> with zero values, which is then enumerated with an empty foreach loop. The action is invoked in parallel as a side-effect of the enumeration.
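For completeness, here is a minimal sketch of how ForAll2 could be plugged into the query from the question. The class names ParallelQueryExtensions and Program and the processed counter are made up for this illustration; the extension method itself is repeated unchanged so that the sketch compiles on its own. How many items get started before the exception surfaces depends on timing and on PLINQ's internal buffering, so no specific output is claimed.

```csharp
using System;
using System.Linq;
using System.Threading;

public static class ParallelQueryExtensions
{
    // Same ForAll2 as above, repeated so that this sketch is self-contained.
    public static void ForAll2<TSource>(this ParallelQuery<TSource> source,
        Action<TSource> action)
    {
        foreach (var _ in source.Select(item => { action(item); return 0; })) { }
    }
}

public static class Program
{
    public static void Main()
    {
        using var cts = new CancellationTokenSource(1000);
        int processed = 0; // hypothetical counter, only used for reporting

        try
        {
            Enumerable.Range(1, 20)
                .AsParallel()
                .WithDegreeOfParallelism(2)
                .WithCancellation(cts.Token)
                .ForAll2(x =>
                {
                    Interlocked.Increment(ref processed);
                    Thread.Sleep(200); // simulate slow work per item
                });
            Console.WriteLine("The query was completed successfully");
        }
        catch (OperationCanceledException)
        {
            Console.WriteLine($"The query was canceled after starting {processed} items");
        }
    }
}
```

The call site only differs from the original query in the name of the final operator, so switching back to the built-in ForAll is a one-word change.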
Upvotes: 0
Reputation: 101473
It seems to be by design, but the logic is a bit different from what you might expect. If we dig into the source code a bit, we'll find the relevant piece of the ForAll implementation here:
while (_source.MoveNext(ref element, ref keyUnused))
{
    if ((i++ & CancellationState.POLL_INTERVAL) == 0)
        _cancellationToken.ThrowIfCancellationRequested();
    _elementAction(element);
}
So it does check for cancellation, but not on every iteration. If we check CancellationState.POLL_INTERVAL:
/// <summary>
/// Poll frequency (number of loops per cancellation check) for situations where per-1-loop testing is too high an overhead.
/// </summary>
internal const int POLL_INTERVAL = 63; //must be of the form (2^n)-1.
// The two main situations requiring POLL_INTERVAL are:
// 1. inner loops of sorting/merging operations
// 2. tight loops that perform very little work per MoveNext call.
// Testing has shown both situations have similar requirements and can share the same constant for polling interval.
//
// Because the poll checks are per-N loops, if there are delays in user code, they may affect cancellation timeliness.
// Guidance is that all user-delegates should perform cancellation checks at least every 1ms.
//
// Inner loop code should poll once per n loop, typically via:
// if ((i++ & CancellationState.POLL_INTERVAL) == 0)
// _cancellationToken.ThrowIfCancellationRequested();
// (Note, this only behaves as expected if FREQ is of the form (2^n)-1
So basically the PLINQ developers assume that the code inside ForAll (and similar methods) is very fast, and as such they consider it wasteful to check for cancellation on every iteration, so they check once every 64 iterations. If you have long-running code, you can check for cancellation yourself. I guess they had to do it like this because no single choice is right for all situations: if they checked on every iteration, you would not be able to avoid the performance cost.
If you increase the number of iterations in your code and adjust the cancellation timeout, you'll see that it does indeed cancel after about 64 iterations (on each partition, so 128 total).
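To make the polling arithmetic concrete, here is a small sketch that replays the bit-mask test from the loop above outside of PLINQ. The names PollDemo and PolledIndices are invented for this illustration; only the value 63 and the (i++ & POLL_INTERVAL) == 0 check are taken from the quoted source.

```csharp
using System;
using System.Collections.Generic;

public static class PollDemo
{
    // Same value as CancellationState.POLL_INTERVAL in the quoted source.
    private const int PollInterval = 63;

    // Returns the zero-based loop indices at which the token would be polled
    // by a single partition that processes itemCount items.
    public static List<int> PolledIndices(int itemCount)
    {
        var polled = new List<int>();
        int i = 0;
        for (int n = 0; n < itemCount; n++)
        {
            // Identical test to the one in the ForAll loop.
            if ((i++ & PollInterval) == 0)
                polled.Add(n);
        }
        return polled;
    }

    public static void Main()
    {
        // A partition that sees only 10 items polls once, before its first item.
        Console.WriteLine(string.Join(", ", PolledIndices(10)));  // 0

        // A partition that sees 200 items polls every 64 items.
        Console.WriteLine(string.Join(", ", PolledIndices(200))); // 0, 64, 128, 192
    }
}
```

In the question's query no partition gets anywhere near 64 items, so the only check happens before the first item, long before the token is canceled after one second.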
Upvotes: 3