user2339814

Reputation: 405

TPL Data Flow Optimization

I have a TPL dataflow set up in this manner:

  1. Download byte array
  2. Process data
  3. Stream processed data to another location

This flow works great, but occasionally it backs up while downloading a file (connection hiccups, etc.). What I'd like to do is run downloads in parallel but still ensure that step 3 executes so that the receiving party gets the payloads in the correct order.

var broadcaster = new BroadcastBlock<string>(d => d);


var downloader = new TransformBlock<string, byte[]>(async data => {
  // Download and return data       
});


var processor = new TransformBlock<byte[], byte[]> (async data => {
  // Process and return data
});


var uploader = new ActionBlock<byte[]>(async input => {
  // Upload file to another location
});


broadcaster.LinkTo(downloader);
downloader.LinkTo(processor);
processor.LinkTo(uploader);


broadcaster.SendAsync("http://someUrl");
broadcaster.SendAsync("http://someOtherUrl");

So in the above code snippet, I'd want the two urls to download simultaneously, but it's important that the first one gets processed by the uploader before the second url. Can somebody point me in the correct direction?

Upvotes: 2

Views: 567

Answers (1)

svick

Reputation: 244777

I'd want the two urls to download simultaneously, but it's important that the first one gets processed by the uploader before the second url

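Then just set MaxDegreeOfParallelism on the downloader block (through the ExecutionDataflowBlockOptions passed to its constructor) to a value greater than 1, and it will behave like this. Dataflow blocks preserve the order of their input by default, so when URLs 1 and 2 are being downloaded simultaneously and 2 completes before 1 does, the block will still wait for 1 to complete before 2 is sent to the following block.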

This might not be the most efficient approach, since a finished download can sit waiting behind a slower earlier one, but it does ensure that the order of processing is maintained across all blocks in a pipeline.
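To illustrate the ordering behavior, here is a minimal sketch rather than the asker's actual pipeline. It assumes the download can be simulated with Task.Delay; the class name OrderedPipelineDemo, the RunAsync helper, the delay values, and the parallelism of 4 are all made up for the example. The processor stage is left out to keep it short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical demo class; the names here are not from the original question.
public static class OrderedPipelineDemo
{
    // Sends the inputs through a parallel "download" stage and returns the
    // order in which the final stage received them.
    public static async Task<List<string>> RunAsync(
        IEnumerable<(string Url, int DelayMs)> inputs)
    {
        var uploaded = new List<string>();

        // Up to 4 items are handled at once, but the block still emits its
        // results in the order the inputs arrived.
        var downloader = new TransformBlock<(string Url, int DelayMs), string>(
            async item =>
            {
                await Task.Delay(item.DelayMs); // stand-in for the real download
                return item.Url;
            },
            new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4 });

        // Default parallelism of 1, so adding to the list needs no locking.
        var uploader = new ActionBlock<string>(url => uploaded.Add(url));

        downloader.LinkTo(
            uploader,
            new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var input in inputs)
        {
            await downloader.SendAsync(input);
        }

        downloader.Complete();
        await uploader.Completion;
        return uploaded;
    }

    public static async Task Main()
    {
        var order = await RunAsync(new[]
        {
            ("http://someUrl", 300),     // slow download, sent first
            ("http://someOtherUrl", 50), // fast download, sent second
        });

        Console.WriteLine(string.Join(", ", order));
    }
}
```

Even though the second URL finishes its simulated download first, the uploader should receive http://someUrl before http://someOtherUrl. In newer versions of the library this ordering is governed by the EnsureOrdered option, which defaults to true; setting it to false would trade the ordering guarantee for throughput.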

Upvotes: 3
