Reputation: 57
I am quite new to using Azure so pardon me for this basic question. I am trying to import a scheduled Excel file that is sent everyday to my email address into my Azure SQL database. So far, I have managed to use Azure Logic App to save the file into a blob. I intend to use Azure Data Factory to copy the blob into my Azure SQL Database (as after I import in, I will be using ADF to do incremental load).
However, my Excel is generally in this table format (this is system generated so I have no control to edit the source). The problem comes when my source has duplicated column headings such as 'Name', but the data itself is not duplicated; they are representing different things. Eg. 'Name' can refer to Country and another column 'Name' refers to Continent. Note: I am able to differentiate it on my end to know which column index to match to which in my database.
What I have done : I have created a dataset in ADF without first row as header (as it will cause problems) and then created a dataflow to filter out the column headers before mapping my source to my sink successfully.
What I am looking for is perhaps is there a more elegant solution to this? I have read that it is possible to change the column header names of my source using import schema when creating the dataset. But when I tried to import an excel as the schema with the column headers I want, this error comes out (Assuming I am trying to import one of the 'Names' as 'Country').
ErrorCode=ExcelInvalidColumnName,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Invalid column name 'Country' is specified while read data from worksheet 'Page 1'.,Source=Microsoft.DataTransfer.ClientLibrary,'
Update: Here's the screenshot of the copy data activity where the mapping source column is empty and I have to individually select several duplicated column names
Upvotes: 0
Views: 2078
Reputation: 11
to add on this regarding copyactivity-behaviour with duplitate column names in source: also the "mapping" section numbers the duplicate columns (starting with 0) and also the datset preview does so.. you CAN'T use "auto-create table" in SINK of copy activity because strangely the generater for auto-create table seems to apply not the "numbering-logic" for dupliate column-names
Upvotes: 0
Reputation: 986
You can just use a Copy Data activity.
Let it pull in the first row with the headers (I made my csv have several columns called thing
).
Then on the mapping tab of Copy Data, click Import Schemas
It will assign unique names to your duplicate column headings, and you can over-type the default output column names like this ...
Upvotes: 0