Reputation: 23
I currently have an Excel file that has multiple worksheets (over 11). This Excel file currently lives in a remote file server. I am trying to use Azure Data FactoryV2 to copy the Excel file and split each worksheet as its own .csv file within an ADLS Gen2 folder. The reason for this is because not every tab has the same schema and I want to only select the valid ones later.
I currently have an ADF dataset pointing to the Excel dataset correctly and have created a parameter for the sheet name using @dataset.SheetName. I am not sure where to go next. After creating a new pipeline I have tried nesting a Copy Activity inside a ForEach activity, however, it asks for the SheetName value.
How do I construct this pipeline to grab the names of the worksheets existing in the Excel file and then iterate a copy activity for each sheet? I cannot assume I will know the sheet names or how many sheets there will be. I would prefer to avoid creating multiple datasets for the Excel file if possible.
Any insight would be appreciated.
Upvotes: 2
Views: 6337
Reputation: 6043
Get list of Excel sheet names in ADF is not support yet and you can vote here.
import pandas
xl = pandas.ExcelFile('data.xlsx')
# see all sheet names
print(xl.sheet_names )
Upvotes: 1