3nomis

Reputation: 1613

Create Azure Batch activity in Data factory

I want to create an Azure Batch activity in my Data Factory pipeline. I have set up a trigger that checks for blobs whose "last modified" date falls within the last 24 hours.
Because I'm dealing with big files, I want to leverage the power of Azure Batch and process 2 blobs at a time on the same machine.
This is the pipeline I've built so far:
[screenshot of the pipeline]
The second activity manipulates the output of the previous one by creating a list variable of {container name}/{blob} entries.
How can I divide my blob addresses into small batches so that I can feed them to the next Batch activity?
Thanks

Upvotes: 1

Views: 131

Answers (1)

wBob

Reputation: 14379

The 'ForEach' activity runs in parallel by default: it processes up to 20 iterations concurrently unless you change the 'Batch count' setting, which can be raised to a maximum of 50. Make sure the 'Sequential' box on your ForEach is unchecked.

[screenshot: ForEach in parallel mode]

If you need to group the items into fixed-size batches, e.g. 3 per batch or 5 per batch, that is a bit trickier. I would look at something like a Stored Procedure activity, a Databricks notebook or a Synapse notebook to do that slightly more complex work for me.
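
As a rough illustration of the grouping step, here is a minimal Python sketch of the kind of logic a notebook could run. The function name `chunk_blob_paths`, the sample paths and the batch size of 2 are all made up for the example; how the list gets into the notebook and how the batches are handed back to the pipeline is not shown and depends on your setup.

```python
from typing import List


def chunk_blob_paths(paths: List[str], batch_size: int = 2) -> List[List[str]]:
    """Split a flat list of blob paths into consecutive groups of batch_size.

    The last group may be shorter if the list does not divide evenly.
    (Hypothetical helper, not part of any Azure SDK.)
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [paths[i:i + batch_size] for i in range(0, len(paths), batch_size)]


# Made-up sample input in the {container name}/{blob} form from the question.
blob_paths = [
    "container-a/file1.csv",
    "container-a/file2.csv",
    "container-b/file3.csv",
    "container-b/file4.csv",
    "container-b/file5.csv",
]

batches = chunk_blob_paths(blob_paths, batch_size=2)
for batch in batches:
    print(batch)
```

Each inner list could then be treated as one ForEach item, so that a single Batch activity run receives 2 blob paths at a time.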

Upvotes: 1
