Reputation: 23
I intend to export a list of files in a ADLS folder to a CSV file using Azure DataFactory.
For instance, I have the following folder structure.
ADLS > FOLDER1 > File1
File2
File3
Now, I am using the Get Metadata activity in Azure DataFactory to get the child items.
{
"childItems": [
{
"name": "DemoFile1",
"type": "File"
},
{
"name": "DemoFile2",
"type": "File"
} ],
"effectiveIntegrationRuntime": "AutoResolveIntegrationRuntime",
"executionDuration": 0
}
I want to export this output into a CSV file. Is there a way?
Upvotes: 1
Views: 2277
Reputation: 5074
There are a couple of options to store the list of source file names into CSV.
Option1:
As mentioned by @Nick.McDermaid in the comments section, you can use flatten transformation in data flow bypassing the get metadata output to data flow activity in the pipeline as I have repro’d in my lab.
Input:
Get Metadata
activity get the list of files from a folder.ForEach
activity. Add append variable activity inside Foreach activity to add all the file names to an array variable@activity('Get Metadata1').output.childItems
@item().name
@join(variables('file_list'),',')
Set variable output:
data flow
to flatten and add it after the set variable.Data Flow:
derived column
transformation to convert the string parameter value to an array. split($get_metadata_output, ',')
flatten
transformation after derived column and flatten the metadata column which is created in the derived column. Add mappings of metadata column under Input columns (this removes any columns extracted from dummy file).Pipeline:
Output:
Option2:
Input:
data flow
source to source dataset and provide source folder path from which folder you want to get the list of files. Do not provide the file name. In this way, it pulls all files data at once.Select
transformation to the source output to remove all the other columns from the source except the file_name.derived column
after select transformation to extract file name from the path. (column which stored file name contains the full path of the file)reverse(dropRight(reverse(File_name),instr(reverse(File_name),'/')-2))
aggregate
transformation to get the distinct record values from the File_name column.Sink preview:
Upvotes: 3