Reputation: 21
I am new to Azure Data Factory. I am trying to get data from an API using a Web activity, then pass it through some activities and a data flow, then some other activities, and at the end store it in Blob storage. In this whole transformation, do I need a persistence layer like Blob or a SQL database between the activities and the Data Flow, or can these activities hold the data themselves? The data size may vary from MB to GB.
Upvotes: 0
Views: 778
Reputation: 11474
> In this whole transformation, do I need a persistence layer like Blob or a SQL database between the activities and the Data Flow?
This depends on your requirements.
AFAIK, in ADF the Web activity, Lookup activity, data flow sink cache, Script activity, and Set Variable activity (arrays only) can return results as activity output. (A Notebook activity can too, if you return the data as a JSON dump from the notebook code.) All of these give the output as an array of objects, apart from the Web activity.
You can get the output of a Lookup activity with the dynamic content expression @activity('activity name').output.value. For a Web activity it is @activity('activity name').output, which gives the output as a JSON object.
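For example, assuming the activities are named Lookup1 and Web1 (placeholder names), the expressions look like this:

```
@activity('Lookup1').output.value
@activity('Web1').output
```

The first returns the array of rows from the Lookup; the second returns the JSON body the API sent back.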
In the data flow, use a cache sink and check Write to activity output, then set the logging level to None on the Data flow activity (writing to activity output requires it). Check the data in the Data flow activity output and use the appropriate dynamic content to read it.
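As a sketch, assuming the Data flow activity is named Dataflow1 and the cache sink inside it is named sink1 (both placeholder names), the cached rows can be read with:

```
@activity('Dataflow1').output.runStatus.output.sink1.value
```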
You can get the Script activity output the same way; its result sets appear under output.resultSets, and you build the dynamic content accordingly.
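For example, assuming the Script activity is named Script1 (a placeholder name), the rows of its first result set would be:

```
@activity('Script1').output.resultSets[0].rows
```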
A Set Variable activity (with an array-type variable) can then store these results as an array of objects, so you don't have to repeat complex dynamic content expressions later in the pipeline.
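As a minimal sketch, assuming an array pipeline variable named myRows (a placeholder), the Set Variable activity's JSON would look roughly like this:

```json
{
    "name": "Set variable1",
    "type": "SetVariable",
    "typeProperties": {
        "variableName": "myRows",
        "value": {
            "value": "@activity('Lookup1').output.value",
            "type": "Expression"
        }
    }
}
```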
> The data size may vary from MB to GB.
But all of the above carry small limits on activity output; the Lookup activity, for instance, is capped at 5,000 rows, i.e. about 4 MB. So you need intermediate storage such as SQL or Blob when working with larger sizes.
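In that case the usual pattern is to land the API response in storage first and let the data flow read from there. A minimal sketch, assuming a REST source dataset named ApiDataset and a Blob sink dataset named BlobDataset (both placeholder names):

```json
{
    "name": "StageApiToBlob",
    "type": "Copy",
    "typeProperties": {
        "source": { "type": "RestSource" },
        "sink": { "type": "JsonSink" }
    },
    "inputs": [ { "referenceName": "ApiDataset", "type": "DatasetReference" } ],
    "outputs": [ { "referenceName": "BlobDataset", "type": "DatasetReference" } ]
}
```

The data flow's source then points at BlobDataset instead of at any activity output, so the size limits above no longer apply.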
Upvotes: 1