Reputation: 95
There is plenty of documentation on how to use Azure Data Factory to read data from blobs into SQL, and even documentation on how to dump the output of a query to a single blob. I'm trying to create one blob for each row in a table (on Azure SQL Server), named by one field and containing the data in another.
My table has a GUID id field and an nvarchar(max) data field (which contains JSON, though that's largely irrelevant). Suppose I have the following rows:
id | data
---------------------------------------+----------
38b2f551-5f13-40ce-8512-c108a05ecd44 | foo
4db5b25b-1194-44e9-a7b2-bc8889c32979 | bar
2a3bd653-ce14-4bd2-9243-6923e97224c6 | baz
I want the following blobs to be created:
https://mycontainer.blob.core.windows.net/myblobs/38b2f551-5f13-40ce-8512-c108a05ecd44
https://mycontainer.blob.core.windows.net/myblobs/4db5b25b-1194-44e9-a7b2-bc8889c32979
https://mycontainer.blob.core.windows.net/myblobs/2a3bd653-ce14-4bd2-9243-6923e97224c6
Their contents should be the associated data field, i.e. foo, bar, and baz, respectively.
Data Factory v2 - Generate a json file per row has an answer that will work, but it involves querying the database once to get all the ids, then N more times to get the data from each row. It seems like it should be possible to query just once for both fields and use one for the filename and one for the contents, but I haven't been able to figure out how.
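For reference, this is the single-query mapping I'm after, sketched in plain Python (this is only an illustration of the desired behavior, not ADF; the container URL matches the example above and the rows are stand-ins):

```python
# Illustrative sketch: each (id, data) row from one query should become
# one blob named by id whose body is data. No Azure calls here; this
# just shows the shape of the mapping.

BASE = "https://mycontainer.blob.core.windows.net/myblobs"

def rows_to_blobs(rows):
    """Map each (id, data) row to a (blob_url, contents) pair."""
    return [(f"{BASE}/{row_id}", data) for row_id, data in rows]

rows = [
    ("38b2f551-5f13-40ce-8512-c108a05ecd44", "foo"),
    ("4db5b25b-1194-44e9-a7b2-bc8889c32979", "bar"),
    ("2a3bd653-ce14-4bd2-9243-6923e97224c6", "baz"),
]
```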
Upvotes: 6
Views: 1010
Reputation: 1145
This is a pretty simple feat natively in ADF:
Here are screenshots with comments to illustrate the exact steps and proof that it works:
1st, here is my mock table, similar to your example: I have pretend FileID and Data fields
2nd, I do a lookup on that table, see the details and output below
3rd, we put in the For Each activity so that we can perform an action or group of actions per item in the array object
4th, we go into the For each and create a Copy Data activity with a Source of Azure SQLDB and a Sink of Azure Blob
And here is what happens in my Blob Storage when I run this pipeline:
Let me know if you need anything else or have any questions, hope this helps! Remember to like, love, and follow :)
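The Lookup → ForEach → Copy wiring described above can be sketched in pipeline JSON roughly like this (a rough sketch, not a complete export: activity and dataset names are mine, and the dataset definitions are omitted; the key pieces are firstRowOnly set to false on the Lookup, the ForEach iterating @activity('LookupRows').output.value, and the blob sink dataset parameterized on @item().FileID for the file name):

```json
{
  "name": "ExportRowsToBlobs",
  "properties": {
    "activities": [
      {
        "name": "LookupRows",
        "type": "Lookup",
        "typeProperties": {
          "source": {
            "type": "AzureSqlSource",
            "sqlReaderQuery": "SELECT FileID, Data FROM dbo.MyTable"
          },
          "firstRowOnly": false
        }
      },
      {
        "name": "PerRow",
        "type": "ForEach",
        "dependsOn": [
          { "activity": "LookupRows", "dependencyConditions": [ "Succeeded" ] }
        ],
        "typeProperties": {
          "items": {
            "value": "@activity('LookupRows').output.value",
            "type": "Expression"
          },
          "activities": [
            {
              "name": "CopyRowToBlob",
              "type": "Copy",
              "inputs": [
                { "referenceName": "SqlRowDataset", "type": "DatasetReference" }
              ],
              "outputs": [
                {
                  "referenceName": "BlobDataset",
                  "type": "DatasetReference",
                  "parameters": { "fileName": "@item().FileID" }
                }
              ]
            }
          ]
        }
      }
    ]
  }
}
```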
Upvotes: 2
Reputation: 713
Instead of using the copy activity directly, you could use a for-each activity to route each row of your SQL lookup to a REST API call to Blob Storage (https://learn.microsoft.com/en-us/rest/api/storageservices/put-blob), or some other means of writing the blob.
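A minimal sketch of the Put Blob call linked above, in Python with only the standard library (account, container, and SAS token values here are placeholder assumptions; the request is built but not sent, so you can inspect it first):

```python
# Sketch of the Put Blob REST request for one row: PUT the row's data to
# https://{account}.blob.core.windows.net/{container}/{blob}?{sas} with
# the required x-ms-blob-type header set to BlockBlob.
import urllib.request

def build_put_blob_request(account, container, blob_name, data, sas_token):
    """Build (but do not send) a Put Blob request for one row."""
    url = f"https://{account}.blob.core.windows.net/{container}/{blob_name}?{sas_token}"
    req = urllib.request.Request(url, data=data.encode("utf-8"), method="PUT")
    # Put Blob requires the blob type header; BlockBlob is the usual choice.
    req.add_header("x-ms-blob-type", "BlockBlob")
    return req
    # To actually upload: urllib.request.urlopen(req)
```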
Upvotes: 0
Reputation: 23782
After studying the case Data Factory v2 - Generate a json file per row, I suggest you don't stick to the copy activity. Based on your description, you could consider the solution below:
1.Use the Get Metadata activity, which can retrieve the metadata of any data in Azure Data Factory. You could get all your blob files' metadata this way.
2.Use a ForEach activity to loop over the file names.
3.Inside the ForEach activity, run an Azure Function activity to call an HTTP-triggered Azure Function. Pass the file name to the Azure Function as a parameter. In the function method, use the Azure Blob Storage SDK to retrieve the content by file name, then insert the whole row into the SQL database via its API.
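The function body step 3 describes could look roughly like this (a hypothetical sketch, not a real Azure Functions handler: fetch_blob and insert_row are illustrative stand-ins for the Azure Blob Storage SDK and a database client, passed in as callables so the wiring is visible):

```python
# Hypothetical shape of the HTTP-triggered function: look up the blob's
# content by the file name the ForEach activity passed in, then hand the
# (name, content) row to a SQL-insert callable.

def handle_request(file_name, fetch_blob, insert_row):
    """Process one file name passed in by the ForEach activity."""
    content = fetch_blob(file_name)    # e.g. BlobClient.download_blob().readall()
    insert_row(file_name, content)     # e.g. an INSERT via a database driver
    return {"file": file_name, "bytes": len(content)}
```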
Upvotes: 0