Reputation: 833
I have two datasets, one "FileShare" (DS1) and another "BlobSource" (DS2). I define a pipeline with one copy activity, which needs to copy the files from DS1 to DS3 (BlobSource), with DS2 specified as a dependency. The activity is specified below:
{
  "type": "Copy",
  "typeProperties": {
    "source": {
      "type": "FileShare"
    },
    "sink": {
      "type": "BlobSource"
    }
  },
  "inputs": [
    {
      "name": "FoodGroupDescriptionsFileSystem"
    },
    {
      "name": "FoodGroupDescriptionsInputBlob"
    }
  ],
  "outputs": [
    {
      "name": "FoodGroupDescriptionsAzureBlob"
    }
  ],
  "policy": {
    "timeout": "01:00:00",
    "concurrency": 1,
    "executionPriorityOrder": "NewestFirst"
  },
  "scheduler": {
    "frequency": "Minute",
    "interval": 15
  },
  "name": "FoodGroupDescriptions",
  "description": "#1 Bulk Import FoodGroupDescriptions"
}
Here, how can I specify multiple source types (both FileShare and BlobSource)? It throws an error when I try to pass them as a list.
Upvotes: 0
Views: 1940
Reputation: 3253
The copy activity doesn't like multiple inputs or outputs. It can only perform a 1 to 1 copy... It won't even change the filename for you in the output dataset, never mind merging files!
This is probably intentional so Microsoft can charge you more for additional activities. But let's not digress into that one.
I suggest having one pipeline that copies both files into some sort of Azure storage using separate activities (one per file). Then have a second, downstream pipeline with a custom activity that reads and merges/concatenates the files to produce a single output.
Remember that ADF isn't an ETL tool like SSIS. It's just there to invoke other Azure services. Copying is about as complex as it gets.
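As a rough sketch of the first pipeline, the single activity from the question could be split into two copy activities, each with one input and one output. This is only an illustration under some assumptions: the two output dataset names (FoodGroupDescriptionsStagedFromFileSystem and FoodGroupDescriptionsStagedFromBlob) and the activity names are hypothetical, and the source/sink type names (FileSystemSource, BlobSource, BlobSink) are what I believe the copy activity expects, so check them against the connector documentation for your ADF version.

```json
{
  "activities": [
    {
      "name": "CopyFoodGroupDescriptionsFromFileSystem",
      "description": "Sketch: copies the file share dataset; the output dataset name is hypothetical",
      "type": "Copy",
      "typeProperties": {
        "source": { "type": "FileSystemSource" },
        "sink": { "type": "BlobSink" }
      },
      "inputs": [
        { "name": "FoodGroupDescriptionsFileSystem" }
      ],
      "outputs": [
        { "name": "FoodGroupDescriptionsStagedFromFileSystem" }
      ],
      "policy": {
        "timeout": "01:00:00",
        "concurrency": 1,
        "executionPriorityOrder": "NewestFirst"
      },
      "scheduler": {
        "frequency": "Minute",
        "interval": 15
      }
    },
    {
      "name": "CopyFoodGroupDescriptionsFromBlob",
      "description": "Sketch: copies the blob dataset; the output dataset name is hypothetical",
      "type": "Copy",
      "typeProperties": {
        "source": { "type": "BlobSource" },
        "sink": { "type": "BlobSink" }
      },
      "inputs": [
        { "name": "FoodGroupDescriptionsInputBlob" }
      ],
      "outputs": [
        { "name": "FoodGroupDescriptionsStagedFromBlob" }
      ],
      "policy": {
        "timeout": "01:00:00",
        "concurrency": 1,
        "executionPriorityOrder": "NewestFirst"
      },
      "scheduler": {
        "frequency": "Minute",
        "interval": 15
      }
    }
  ]
}
```

The downstream pipeline's custom activity would then take both staged datasets as inputs and write the merged result to FoodGroupDescriptionsAzureBlob.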
Upvotes: 1