Gregory Seidman

Reputation: 95

Create Blobs with Filename and Data from SQL Rows

There is plenty of documentation on how to use Azure Data Factory to read data from blobs into SQL, and even documentation on how to dump the output of a query to a single blob. I'm trying to create one blob for each row in a table (on Azure SQL Server), named by one field and containing the data in another.

My table has a GUID id field and an nvarchar(max) data field (which contains JSON, though that's largely irrelevant). Suppose I have the following rows:

                 id                    |  data
---------------------------------------+----------
38b2f551-5f13-40ce-8512-c108a05ecd44   |  foo
4db5b25b-1194-44e9-a7b2-bc8889c32979   |  bar
2a3bd653-ce14-4bd2-9243-6923e97224c6   |  baz

I want the following blobs to be created:

https://mycontainer.blob.core.windows.net/myblobs/38b2f551-5f13-40ce-8512-c108a05ecd44
https://mycontainer.blob.core.windows.net/myblobs/4db5b25b-1194-44e9-a7b2-bc8889c32979
https://mycontainer.blob.core.windows.net/myblobs/2a3bd653-ce14-4bd2-9243-6923e97224c6

Their contents should be the associated data field, i.e. foo, bar, and baz, respectively.

Data Factory v2 - Generate a json file per row has an answer that will work, but it involves querying the database once to get all the ids, then N more times to get the data for each row. It seems like it should be possible to query just once for both fields and use one as the filename and the other as the contents, but I haven't been able to figure out how.
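For comparison, outside ADF (say, in a small script or an Azure Function) the "query once, one blob per row" idea is straightforward. Below is a hedged sketch assuming the pyodbc and azure-storage-blob packages; the connection strings and the MyTable name are placeholders, while the id/data columns come from the question. The per-row logic is split into its own function that takes an injected upload callable, so it can be exercised without a storage account.

```python
def write_rows_as_blobs(rows, upload):
    """For each (id, data) row, create one blob named after the id.

    `upload` is any callable taking (blob_name, content); injecting it
    keeps the per-row logic testable without a storage account.
    """
    for row_id, data in rows:
        upload(str(row_id), data)

def main():
    # Third-party packages, assumed installed: pyodbc, azure-storage-blob.
    import pyodbc
    from azure.storage.blob import ContainerClient

    # Placeholder connection strings and table name -- substitute your own.
    conn = pyodbc.connect("<sql-connection-string>")
    rows = conn.cursor().execute("SELECT id, data FROM MyTable").fetchall()

    container = ContainerClient.from_connection_string(
        "<storage-connection-string>", container_name="myblobs")
    write_rows_as_blobs(
        rows,
        lambda name, body: container.upload_blob(name, body, overwrite=True))
```

This runs exactly one query and one Put Blob call per row, which is the behavior the question is after.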

Upvotes: 6

Views: 1010

Answers (3)

Trent Tamura

Reputation: 1145

This is a pretty simple feat natively in ADF:

  1. Use a Lookup Activity to get the row results of your Database Table
  2. Use a For Each to Iterate over each row from the Lookup
  3. Within the For Each have a Copy Data activity that copies from DB (as a passthrough really) to Blob Storage

Here are screenshots with comments to illustrate the exact steps and proof that it works:

First, here is my mock table, similar to your example, with pretend FileID and Data fields:

Mock Table

Second, I do a Lookup on that table; see the details and output below:

Lookup Activity Example

Third, we put in the For Each activity so that we can perform an action, or group of actions, per item in the array:

For each Example

Fourth, inside the For Each we create a Copy Data activity with a Source of Azure SQL DB and a Sink of Azure Blob Storage.

Source Details: Copy Data Source

Sink Details: Copy Data Sink
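The crux of this setup, which lives only in the screenshots above, is parameterizing the sink dataset's file name and feeding it from the current ForEach item. A sketch of the dynamic-content expressions involved, assuming the Lookup returns columns named id and data as in the question:

```
ForEach items:                    @activity('Lookup1').output.value
Sink dataset fileName parameter:  @item().id
Copy source query per iteration:  SELECT data FROM MyTable
                                  WHERE id = '@{item().id}'
```

The activity name Lookup1 and the table name MyTable are placeholders; the point is that @item() exposes the whole row, so one field can drive the blob name while the other supplies the contents.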

And here is what happens in my Blob Storage when I run this pipeline:

Proof it works

Let me know if you need anything else or have any questions, hope this helps! Remember to like, love, and follow :)

Upvotes: 2

MartinJaffer-MSFT

Reputation: 713

Instead of using the Copy activity directly, you could use a ForEach activity to route each row of your SQL lookup to a REST API call against Blob Storage (https://learn.microsoft.com/en-us/rest/api/storageservices/put-blob), or some other way of writing the data.
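For the curious, here is a hedged sketch of what that Put Blob call looks like from any HTTP client. It assumes a container-level SAS token with write permission (the SAS query string below is a placeholder); the account and container names match the question's example URL. Building the request is separated from sending it so the URL, method, and headers can be checked in isolation.

```python
import urllib.request

ACCOUNT = "mycontainer"   # storage account name, from the question's URL
CONTAINER = "myblobs"
SAS = "<sas-query-string>"  # placeholder: a SAS token with write permission

def put_blob_request(blob_name, body):
    """Build the PUT request for the Put Blob REST operation."""
    url = f"https://{ACCOUNT}.blob.core.windows.net/{CONTAINER}/{blob_name}?{SAS}"
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="PUT",
        # Put Blob requires the blob type header; SAS auth rides in the URL.
        headers={"x-ms-blob-type": "BlockBlob"},
    )

# To actually send one:
# urllib.request.urlopen(put_blob_request("38b2f551-...", "foo"))
```

In ADF itself the equivalent would be a Web activity inside the ForEach, issuing the same PUT per row.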

Upvotes: 0

Jay Gong

Reputation: 23782

After studying the case Data Factory v2 - Generate a json file per row, I suggest you don't stick to the Copy activity. Based on your description, you could consider the solution below:

1. Use the Get Metadata Activity, which can retrieve the metadata of any data in Azure Data Factory. You could get all your blob files' metadata this way.

2. Use a ForEach Activity to loop over the file names.

3. In the ForEach Activity, run an Azure Function Activity to call an HTTP Trigger Azure Function, passing the file name as a parameter. In the function method, use the Azure Blob Storage SDK to retrieve the content by file name, then insert the whole row into the SQL database via the API.

Upvotes: 0
