Gave Drohl
Gave Drohl

Reputation: 93

Many to Many TPL Dataflow does not process all inputs

I have a TPL Datalow pipeline with two sources and two targets linked in a many-to-many fashion. The target blocks appear to complete successfully, however, it usually drops one or more inputs. I've attached the simplest possible full repro I could come up with below. Any ideas?

Notes:

  1. The problem only occurs if the artificial delay is used while generating the input.
  2. Complete() is successfully called for both sources, but one of the source's Completion task hangs in the WaitingForActivation state, even though both Targets complete successfully.
  3. I can't find any documentation stating many-to-many dataflows aren't supported, and this question's answer implies it is - https://social.msdn.microsoft.com/Forums/en-US/19d831af-2d3f-4d95-9672-b28ae53e6fa0/completion-of-complex-graph-dataflowgraph-object?forum=tpldataflow
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class Program
{
    private const int NumbersPerSource = 10;
    private const int MaxDelayMilliseconds = 10;

    static async Task Main(string[] args)
    {
        int numbersProcessed = 0;

        var source1 = new BufferBlock<int>();
        var source2 = new BufferBlock<int>();

        var target1 = new ActionBlock<int>(i => Interlocked.Increment(ref numbersProcessed));
        var target2 = new ActionBlock<int>(i => Interlocked.Increment(ref numbersProcessed));

        var linkOptions = new DataflowLinkOptions() { PropagateCompletion = true };
        source1.LinkTo(target1, linkOptions);
        source1.LinkTo(target2, linkOptions);
        source2.LinkTo(target1, linkOptions);
        source2.LinkTo(target2, linkOptions);

        var task1 = Task.Run(() => Post(source1));
        var task2 = Task.Run(() => Post(source2));

        // source1 or source2 Completion tasks may never complete even though Complete is always successfully called.
        //await Task.WhenAll(task1, task2, source1.Completion, source2.Completion, target1.Completion, target2.Completion);
        await Task.WhenAll(task1, task2, target1.Completion, target2.Completion);

        Console.WriteLine($"{numbersProcessed} of {NumbersPerSource * 2} numbers processed.");
    }

    private static async Task Post(BufferBlock<int> source)
    {
        foreach (var i in Enumerable.Range(0, NumbersPerSource)) {
            await Task.Delay(TimeSpan.FromMilliseconds(GetRandomMilliseconds()));
            Debug.Assert(source.Post(i));
        }
        source.Complete();
    }

    private static Random Random = new Random();

    private static int GetRandomMilliseconds()
    {
        lock (Random) {
            return Random.Next(0, MaxDelayMilliseconds);
        }
    }
}

Upvotes: 2

Views: 322

Answers (1)

Theodor Zoulias
Theodor Zoulias

Reputation: 43683

As @MikeJ pointed out in a comment, linking the blocks using the PropagateCompletion in a many-to-many dataflow configuration can cause the premature completion of some target blocks. In this case the target1 and target2 are both marked as completed when any of the two source blocks completes, leaving the other source unable to complete, because there are still messages in it's output buffer. These messages are never going to be consumed, because none of the linked target blocks is willing to accept them.

To fix this problem you could use the custom PropagateCompletion method below:

public static void PropagateCompletion(IDataflowBlock[] sources,
    IDataflowBlock[] targets)
{
    // Arguments validation omitted
    Task allSourcesCompletion = Task.WhenAll(sources.Select(s => s.Completion));
    ThreadPool.QueueUserWorkItem(async _ =>
    {
        try { await allSourcesCompletion.ConfigureAwait(false); } catch { }

        Exception exception = allSourcesCompletion.IsFaulted ?
            allSourcesCompletion.Exception : null;

        foreach (var target in targets)
        {
            if (exception is null) target.Complete(); else target.Fault(exception);
        }
    });
}

Usage example:

source1.LinkTo(target1);
source1.LinkTo(target2);
source2.LinkTo(target1);
source2.LinkTo(target2);
PropagateCompletion(new[] { source1, source2 }, new[] { target1, target2 });

Notice that no DataflowLinkOptions are passed when linking the sources to the targets in this example.

Upvotes: 4

Related Questions