Imran

Reputation: 17

How to load files from Blob to SQL DW by using Azure Data Factory?

I have tried many ways to load data from:

  1. Azure Blob to Azure SQL Synapse.

My requirement is:

     (Input) Blob storage  --->  Azure SQL Synapse (Output)
         emp_dummy.csv     --->  emp_dummy table
         dept_dummy.csv    --->  dept_dummy table
         sales_dummy.csv   --->  sales_dummy table
     ... and so on

...

We have files starting with different names, but the format is .csv only. I have been trying this in various ways using the Get Metadata activity or the Lookup activity.

When I tried with the below activity, I am facing this error ([ADF pipeline screenshot](https://i.sstatic.net/RynIb.png)):

{
    "errorCode": "2200",
    "message": "ErrorCode=UserErrorMissingPropertyInPayload,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Required property 'fileName' is missing in payload.,Source=Microsoft.DataTransfer.ClientLibrary,'",
    "failureType": "UserError",
    "target": "Copy data1",
    "details": []
}

I hope I have mentioned all the details; if you need more, let me know.

Upvotes: 0

Views: 1394

Answers (2)

ASH

Reputation: 20312

I did a Google search for you. I found several really bad tutorials out there, but the two links below look pretty darn on point.

https://intellipaat.com/blog/azure-data-factory-tutorial/

https://medium.com/@adilsonbna/using-azure-data-lake-to-copy-data-from-csv-file-to-a-sql-database-712c243db658

Remember, when you're copying data from file stores by using Azure Data Factory, you can now configure wildcard file filters to let Copy Activity pick up only files that have the defined naming pattern—for example, "*.csv" or "???20180504.json".

For reference, look at the image below.

enter image description here

If you want to iterate through all the files in different folders in a Blob environment, then instead of setting the File or Folder to this:

adfv2/SalesJan2009.csv

You can set the File or Folder to this:

adfv2/Sales*2009.csv

That will pick up and merge all the Sales files from 2009 in a single Copy activity run, which you can then load to SQL Server (Data Warehouse, Synapse, etc.).
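As a rough sketch, assuming a DelimitedText dataset over Azure Blob Storage (the folder and file names here just mirror the example above), the wildcard ends up in the Copy activity's source settings roughly like this:

    "source": {
        "type": "DelimitedTextSource",
        "storeSettings": {
            "type": "AzureBlobStorageReadSettings",
            "recursive": true,
            "wildcardFolderPath": "adfv2",
            "wildcardFileName": "Sales*2009.csv"
        },
        "formatSettings": {
            "type": "DelimitedTextReadSettings"
        }
    }

A pattern like "*.csv" in wildcardFileName works the same way if you simply want every csv file in the folder.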

Upvotes: 1

Leon Yue

Reputation: 16411

I figured it out.

Here are my example steps: load two csv files to ADW, and auto-create tables with the same names as the csv filenames.

Csv files in blob storage:

enter image description here

Get all the filenames in the blob container 'backup':

enter image description here

enter image description here
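A minimal sketch of that Get Metadata activity in pipeline JSON (the dataset name BlobBackupFolder is just a placeholder for a dataset pointing at the 'backup' container; the key part is requesting childItems):

    {
        "name": "Get Metadata2",
        "type": "GetMetadata",
        "typeProperties": {
            "dataset": {
                "referenceName": "BlobBackupFolder",
                "type": "DatasetReference"
            },
            "fieldList": [
                "childItems"
            ]
        }
    }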

Foreach item settings:

@activity('Get Metadata2').output.childItems

enter image description here
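In pipeline JSON, that Foreach would look roughly like this (a sketch; the Copy activity from the next step goes inside the activities array):

    {
        "name": "ForEach1",
        "type": "ForEach",
        "dependsOn": [
            {
                "activity": "Get Metadata2",
                "dependencyConditions": [ "Succeeded" ]
            }
        ],
        "typeProperties": {
            "items": {
                "value": "@activity('Get Metadata2').output.childItems",
                "type": "Expression"
            },
            "activities": [ ]
        }
    }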

Copy activity in Foreach:

In the copy activity, use another blob source dataset and add a parameter to choose the file:

enter image description here

enter image description here

Source settings:

enter image description here
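As a sketch, the parameterized blob source dataset and the way the copy activity passes the current file name into it could look like this (dataset, linked service and parameter names are placeholders, not taken from the screenshots):

    {
        "name": "SourceCsvFile",
        "properties": {
            "linkedServiceName": {
                "referenceName": "AzureBlobStorage1",
                "type": "LinkedServiceReference"
            },
            "parameters": {
                "fileName": { "type": "string" }
            },
            "type": "DelimitedText",
            "typeProperties": {
                "location": {
                    "type": "AzureBlobStorageLocation",
                    "container": "backup",
                    "fileName": {
                        "value": "@dataset().fileName",
                        "type": "Expression"
                    }
                },
                "columnDelimiter": ",",
                "firstRowAsHeader": true
            }
        }
    }

And in the Copy activity's inputs, the parameter is fed from the Foreach item:

    "inputs": [
        {
            "referenceName": "SourceCsvFile",
            "type": "DatasetReference",
            "parameters": {
                "fileName": "@item().name"
            }
        }
    ]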

Sink dataset (ADW):

enter image description here

enter image description here

Sink settings:

enter image description here

table name expression: @split(item().name, '.')[0]

Note: Get Metadata returns the full file name, like 'test.csv'; when we set the table name, we need to split it and use 'test' as the table name.
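Roughly, the parameterized ADW sink dataset would look like this (names are placeholders):

    {
        "name": "SinkAdwTable",
        "properties": {
            "linkedServiceName": {
                "referenceName": "AzureSqlDW1",
                "type": "LinkedServiceReference"
            },
            "parameters": {
                "tableName": { "type": "string" }
            },
            "type": "AzureSqlDWTable",
            "typeProperties": {
                "schema": "dbo",
                "table": {
                    "value": "@dataset().tableName",
                    "type": "Expression"
                }
            }
        }
    }

And on the Copy activity, the sink block (inside typeProperties) plus the outputs reference; "tableOption": "autoCreate" is what lets the copy create the table automatically:

    "sink": {
        "type": "SqlDWSink",
        "tableOption": "autoCreate"
    }

    "outputs": [
        {
            "referenceName": "SinkAdwTable",
            "type": "DatasetReference",
            "parameters": {
                "tableName": "@split(item().name, '.')[0]"
            }
        }
    ]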

Execute pipeline: enter image description here

Check data in ADW: enter image description here

Hope this helps.

Upvotes: 1
