user2845758

Reputation: 21

Azure Data Factory: pass data between activities without storing it anywhere

I am new to Azure Data Factory. I am trying to get data from an API using a Web activity, then use some other activities, then a Data Flow, then some more activities, and finally store the result in a blob. In this whole transformation, do I need a persistence layer such as Blob storage or a SQL database between the activities and the Data Flow, or can the activities themselves hold the data? The data size may vary from MB to GB.

Upvotes: 0

Views: 778

Answers (1)

Rakesh Govindula

Reputation: 11474

In this whole transformation, do I need a persistence layer such as Blob storage or a SQL database between the activities and the Data Flow?

This depends on your requirement.

AFAIK, in ADF the Web activity, Lookup activity, Data Flow sink cache, Script activity, and Set Variable activity (arrays only) give their results as activity output. (A Notebook activity does as well, if the notebook code returns the result as a JSON dump.)

All of these return the output as an array of objects, apart from the Web activity.

  • We can get the output of a Lookup activity using the dynamic content @activity('activity name').output.value.

  • For the Web activity it is @activity('activity name').output, which gives the output as a JSON object.

  • In the Data Flow sink, use a cache sink and check Write to activity output. Then set the logging level to None in the Data Flow activity.

    Check the data in the Data Flow activity output and use the appropriate dynamic content to read it (see the sketch after this list).

  • You can get the Script activity output in the same way from the activity's output JSON and build the dynamic content according to it.

  • A Set Variable activity can be used to store these results as an array of objects (to avoid repeating a complex dynamic content expression).
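
Putting these together, here is a minimal sketch of the dynamic content you could use to read each output in a later activity. The activity and sink names (Lookup1, Web1, Dataflow1, sink1, Script1) are placeholders; replace them with your own names.

    Lookup activity:       @activity('Lookup1').output.value
    Web activity:          @activity('Web1').output
    Data flow cache sink:  @activity('Dataflow1').output.runStatus.output.sink1.value
    Script activity:       @activity('Script1').output.resultSets[0].rows

For the Set Variable activity, create a pipeline variable of type Array and set its value to one of the array expressions above.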

The data size may vary from MB to GB.

But all of the above activity outputs are limited to 5000 rows, i.e. about 4 MB. So you need intermediate storage (SQL or Blob) when the data is larger than that.

Upvotes: 1
