Reputation: 341
I want to get the column names for a Parquet file. I have a Get Metadata activity in my pipeline, and it uses a Parquet dataset with only the root folder provided. Because only the folder is provided, ADF is not letting me get the file structure that contains the column names. The file name is not provided because it can change. Can anyone provide some advice on how to approach this?
Upvotes: 3
Views: 7523
Reputation: 5074
You will need 2 Get Metadata activities and a ForEach activity to get the file structure if your file name is not the same every time.
Source dataset:
Parameterize the file name as the name changes frequently.
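As a rough sketch of what a parameterized dataset can look like in the JSON code view: the dataset name `ParquetFile`, the parameter name `fileName`, the linked service, the container, and the folder path are all hypothetical, and the location type assumes Azure Blob Storage (other stores use a different location type).

```json
{
    "name": "ParquetFile",
    "properties": {
        "description": "Sketch only: dataset, linked service, container, and folder names are hypothetical.",
        "linkedServiceName": {
            "referenceName": "MyBlobStorage",
            "type": "LinkedServiceReference"
        },
        "parameters": {
            "fileName": { "type": "string" }
        },
        "type": "Parquet",
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "container": "mycontainer",
                "folderPath": "root",
                "fileName": {
                    "value": "@dataset().fileName",
                    "type": "Expression"
                }
            }
        }
    }
}
```

Any activity that references this dataset then supplies a value for `fileName` at run time, so the file name never has to be hard-coded.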
Preview of source data:
Get Metadata1:
Use a Get Metadata activity with the Child items field to get the file names in the folder dynamically.
Output of Get Metadata1: the names of the files in the folder.
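In the pipeline JSON, the first activity might look roughly like this. It is a sketch: `ParquetFolder` is a hypothetical dataset that points at the root folder only, with no file name.

```json
{
    "name": "Get Metadata1",
    "description": "Sketch only: ParquetFolder is a hypothetical folder-level dataset.",
    "type": "GetMetadata",
    "typeProperties": {
        "dataset": {
            "referenceName": "ParquetFolder",
            "type": "DatasetReference"
        },
        "fieldList": [ "childItems" ]
    }
}
```

The `childItems` output is an array in which each entry has a `name` and a `type` (`File` or `Folder`), so subfolders appear in the list as well.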
ForEach activity:
Using the ForEach activity, you can iterate over the array in the Get Metadata1 output and get the name of each item.
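A minimal sketch of the ForEach definition, assuming the first activity is named `Get Metadata1` and returns `childItems`. The inner `activities` array is left empty here and would hold whatever runs per file.

```json
{
    "name": "ForEach1",
    "description": "Sketch only: loops over the childItems array returned by Get Metadata1.",
    "type": "ForEach",
    "dependsOn": [
        {
            "activity": "Get Metadata1",
            "dependencyConditions": [ "Succeeded" ]
        }
    ],
    "typeProperties": {
        "items": {
            "value": "@activity('Get Metadata1').output.childItems",
            "type": "Expression"
        },
        "activities": []
    }
}
```

Inside the loop, `@item().name` resolves to the name of the current file.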
Get Metadata2:
Add a Get Metadata activity inside the ForEach activity to get the file structure (the column list) of the current file in the folder. The loop runs once per item in the folder (1 or more).
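The inner activity could be sketched as follows. It assumes a hypothetical file-level dataset named `ParquetFile` that exposes a `fileName` parameter; both names are placeholders for whatever your dataset uses.

```json
{
    "name": "Get Metadata2",
    "description": "Sketch only: ParquetFile and its fileName parameter are hypothetical names.",
    "type": "GetMetadata",
    "typeProperties": {
        "dataset": {
            "referenceName": "ParquetFile",
            "type": "DatasetReference",
            "parameters": {
                "fileName": {
                    "value": "@item().name",
                    "type": "Expression"
                }
            }
        },
        "fieldList": [ "structure" ]
    }
}
```

Later activities in the same loop iteration can read the columns with `@activity('Get Metadata2').output.structure`, which is a list of entries that each carry a column `name` and `type`.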
Output of Get Metadata2:
Upvotes: 2
Reputation: 4945
You can parameterize the file name in the dataset. Alternatively, use a Get Metadata activity to get the list of files in the folder, and then a second Get Metadata activity to get the list of columns for each of those files.
Upvotes: 0