Reputation: 1258
Hi, I am using Azure Data Factory for a Copy activity. I want the copy to be recursive across a container and its subfolders, as follows: myfolder/{Year}/{Month}/{Day}/{Hour}/New_Generated_File.csv
The files that I generate and import into the folder always have a different name.
The problem is that the activity seems to wait forever.
The pipeline is scheduled hourly.
I'm attaching the JSON of the dataset and the linked service.
Dataset:
{
    "name": "Txns_In_Blob",
    "properties": {
        "structure": [
            {
                "name": "Column0",
                "type": "String"
            },
            [....Other Columns....]
        ],
        "published": false,
        "type": "AzureBlob",
        "linkedServiceName": "LinkedService_To_Blob",
        "typeProperties": {
            "folderPath": "uploadtransactional/yearno={Year}/monthno={Month}/dayno={Day}/hourno={Hour}/{Custom}.csv",
            "format": {
                "type": "TextFormat",
                "rowDelimiter": "\n",
                "columnDelimiter": " "
            }
        },
        "availability": {
            "frequency": "Hour",
            "interval": 1
        },
        "external": true,
        "policy": {}
    }
}
Linked Service:
{
    "name": "LinkedService_To_Blob",
    "properties": {
        "description": "",
        "hubName": "dataorchestrationsystem_hub",
        "type": "AzureStorage",
        "typeProperties": {
            "connectionString": "DefaultEndpointsProtocol=https;AccountName=wizestorage;AccountKey=**********"
        }
    }
}
Upvotes: 1
Views: 3012
Reputation: 3004
It is not mandatory to give the file name in the dataset's folderPath property. Just remove the file name and Data Factory will load all the files in that folder for you. Note that the {Year}, {Month}, {Day}, and {Hour} variables used in the folderPath must be defined in a partitionedBy section so they resolve from the slice start time:
{
    "name": "Txns_In_Blob",
    "properties": {
        "structure": [
            {
                "name": "Column0",
                "type": "String"
            },
            [....Other Columns....]
        ],
        "published": false,
        "type": "AzureBlob",
        "linkedServiceName": "LinkedService_To_Blob",
        "typeProperties": {
            "folderPath": "uploadtransactional/yearno={Year}/monthno={Month}/dayno={Day}/hourno={Hour}/",
            "partitionedBy": [
                { "name": "Year", "value": { "type": "DateTime", "date": "SliceStart", "format": "yyyy" } },
                { "name": "Month", "value": { "type": "DateTime", "date": "SliceStart", "format": "MM" } },
                { "name": "Day", "value": { "type": "DateTime", "date": "SliceStart", "format": "dd" } },
                { "name": "Hour", "value": { "type": "DateTime", "date": "SliceStart", "format": "HH" } }
            ],
            "format": {
                "type": "TextFormat",
                "rowDelimiter": "\n",
                "columnDelimiter": " "
            }
        },
        "availability": {
            "frequency": "Hour",
            "interval": 1
        },
        "external": true,
        "policy": {}
    }
}
With the above folderPath, Data Factory will generate the runtime value
uploadtransactional/yearno=2016/monthno=05/dayno=30/hourno=07/
for a slice whose start time is 2016-05-30 07:00 UTC.
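For completeness, here is a minimal sketch of an hourly Data Factory (v1) pipeline with a Copy activity that consumes this dataset; the pipeline name, the output dataset Txns_Out, the sink type, and the start/end window are hypothetical placeholders, not from the original post.
{
    "name": "HourlyCopyPipeline",
    "properties": {
        "description": "Sketch only: output dataset, sink type, and active period are placeholders",
        "activities": [
            {
                "name": "CopyHourlyBlobs",
                "type": "Copy",
                "inputs": [ { "name": "Txns_In_Blob" } ],
                "outputs": [ { "name": "Txns_Out" } ],
                "typeProperties": {
                    "source": { "type": "BlobSource" },
                    "sink": { "type": "BlobSink" }
                },
                "scheduler": { "frequency": "Hour", "interval": 1 },
                "policy": { "timeout": "01:00:00", "retry": 2 }
            }
        ],
        "start": "2016-05-30T00:00:00Z",
        "end": "2016-05-31T00:00:00Z"
    }
}
Each hourly slice then resolves the dataset's folderPath from SliceStart and copies every file found under that folder.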
Upvotes: 2