Monsta
Monsta

Reputation: 113

How to load multiple files with overlapping but dynamic schema columns using azure copy activity?

I have around 7-8 files with different but overlapping schemas. However i want the schema of the target (azure synapse table) to be the schema of the largest no of columns.

my files have - 61 columns, 93 columns and 96 columns so my target table will have 96 columns

but while loading the 61 column tables, i want the rest 30 something rows to be loaded as NULL. I am using data flow to load recursively using *.csv option.

Can anyone please let me know how to achieve this?

my sources are csv files in azure BLOB and my target is one single table with the schema of the 96 columns.

(one more catch is, just one of the files has two columns named different but values are same, i dont know how to map this else it will look like two more new column and we will end up having 98 columns)

also i cant do manual mapping because doing that for these many files would take a lot of time.

Please let me know. Thanks in advance

Upvotes: 1

Views: 1022

Answers (1)

Jay Gong
Jay Gong

Reputation: 23782

I try to implement your requirement with copy activity in ADF.

Prepare for the test data:

three files in blob storage container.

enter image description here

enter image description here

enter image description here

Column setting in SQL DB:

enter image description here

Then use Get Metadata Activity and For Each Activity in ADF:

enter image description here

Configuration for Get Metadata Activity:

enter image description here

Configuration for For Each Activity:

enter image description here

Inside For Each Activity ,please use copy activity. Set blob storage as source dataset and sql db(i use sql db for test,you could use synapse db here) as sink dataset.

enter image description here

Test Result in SQL DB:

enter image description here

Furthermore,to be honest,i didn't find any way to map just one of the files has two columns named different but values are same. I believe it's not supported by ADF so far. Since you definitely know which column matches this situation, you may deal with them using code later.(which i think it's not too tough)

Upvotes: 1

Related Questions