Reputation: 1
My problem: I have a Data Lake Gen2 storage account with an import directory that contains .xlsx files.
Now I'm trying to create a dataset pointing to this directory. The directory will contain multiple .xlsx files and also an archive directory, to which processed .xlsx files will be moved.
I want to point the dataset specifically at the import folder and not at the import/archive folder. From what I've read, I should use a wildcard like *.xlsx in the import directory.
However, I cannot get the dataset to work with the wildcard. When I point it directly to the FileName.xlsx file, it works without a problem:
[Screenshot: working dataset pointing directly to one file]
What am I doing wrong?
I tried entering the sheet name manually and also tried using sheet index 0; both give me this error:
ADLS Gen2 operation failed for: Operation returned an invalid status code 'NotFound'. Account: '********'. FileSystem: ''. Path: 'ingress/industrysectors/import/*.xlsx'. ErrorCode: 'PathNotFound'. Message: 'The specified path does not exist.'.
When I remove the wildcard '*.xlsx', the file is found. However, that also means that in the future the .xlsx files in the import/archive folder will be included in the dataset as well.
Upvotes: 0
Views: 104
Reputation: 5317
ADLS Gen2 operation failed for: Operation returned an invalid status code 'NotFound'. Account: '********_********'. FileSystem: '********_'. Path: 'ingress/industrysectors/import/*.xlsx'. ErrorCode: 'PathNotFound'. Message: 'The specified path does not exist.'
From your description, you are specifying the wildcard path at the dataset level, which is the likely cause of the error above: the error shows '*.xlsx' being looked up as a literal path. Instead, give only the container name at the dataset level.
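A dataset along these lines might look roughly like the sketch below. This is an illustration, not the exact definition from the screenshot: the dataset name `ExcelImportFolder` is hypothetical, the `<...>` placeholders stand for your own linked service and container, `sheetIndex` 0 is taken from the question, and `firstRowAsHeader` is an assumption about the files.

```json
{
  "name": "ExcelImportFolder",
  "properties": {
    "type": "Excel",
    "linkedServiceName": {
      "referenceName": "<your ADLS Gen2 linked service>",
      "type": "LinkedServiceReference"
    },
    "typeProperties": {
      "location": {
        "type": "AzureBlobFSLocation",
        "fileSystem": "<your container>"
      },
      "sheetIndex": 0,
      "firstRowAsHeader": true
    }
  }
}
```

Note that the location holds no folder or file name and no wildcard; those are supplied by the activity that reads the dataset.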
In the copy activity, add this dataset as the source, select the wildcard file path option, and enter the directory as the folder path and *.xlsx as the file name.
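In the pipeline JSON, the source part of the copy activity would then look something like the fragment below. It is a sketch under assumptions: the folder path is copied from the error message and is assumed to be relative to the container, and `recursive` is set to `false` on the assumption that files under import/archive should be left out, since that setting controls whether subfolders are read.

```json
{
  "source": {
    "type": "ExcelSource",
    "storeSettings": {
      "type": "AzureBlobFSReadSettings",
      "recursive": false,
      "wildcardFolderPath": "ingress/industrysectors/import",
      "wildcardFileName": "*.xlsx"
    }
  }
}
```

With the wildcard kept in the activity's store settings rather than in the dataset, the same dataset can be reused by other activities that need a different folder or file pattern.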
Add the required sink and debug the pipeline. All .xlsx files are then copied to the sink without any error.
Upvotes: 0