Reputation: 59
Environments
Scenario
I have ADF pipeline which reads the data from On premise server and writes the data to azure data lake.
For the same - I have provided Folder structure in ADF*(dataset)*as follows
Folder Path : - DBName/RawTables/Transactional
File Path : - TableName.csv
Problem
Is it possible to parameterized the Folder name or file path ? Basically - if tomorrow - I want to change the folder path*(without deployment)* then we should be updating the metadata or table structure.
Upvotes: 0
Views: 8233
Reputation: 565
What you want to do is possible with the new Lookup activity in Azure Data Factory V2. Documentation is here: Lookup Activity.
A JSON example would be something like this:
{
"name": "LookupPipelineDemo",
"properties": {
"activities": [
{
"name": "LookupActivity",
"type": "Lookup",
"typeProperties": {
"dataset": {
"referenceName": "LookupDataset",
"type": "DatasetReference"
}
}
},
{
"name": "CopyActivity",
"type": "Copy",
"typeProperties": {
"source": {
"type": "SqlSource",
"sqlReaderQuery": "select * from @{activity('LookupActivity').output.tableName}"
},
"sink": {
"type": "BlobSink"
}
},
"dependsOn": [
{
"activity": "LookupActivity",
"dependencyConditions": [ "Succeeded" ]
}
],
"inputs": [
{
"referenceName": "SourceDataset",
"type": "DatasetReference"
}
],
"outputs": [
{
"referenceName": "SinkDataset",
"type": "DatasetReference"
}
]
}
]
}
}
Upvotes: 0
Reputation: 446
Mangesh, why don't you try the .Net custom activity in ADF. This custom activity will be your first activity that will potentially check for the processed folder and if the processed folder is present it will move that to History(say) folder. As, ADF is a platform for data movement and data transformation, it doesn't deal with the IO activity. You can learn more about the .Net custom activity at:
https://learn.microsoft.com/en-us/azure/data-factory/data-factory-use-custom-activities
Upvotes: 0
Reputation: 3253
So the short answer here is no. You won't be able to achieve this level of dynamic flexibility with ADF on its own.
You'll need to add new defined datasets to your pipeline as inputs for folder changes. In Data Lake you could probably get away with a single stored procedure that accepts a parameter for the file path which could be reused. But this would still require tweaks to the ADF JSON when calling the proc.
Of course, the catch all situation here is to use an ADF custom activity and write a C# class with methods to do whatever you need. Maybe overkill though and lots of effort to setup the authentication to data lake store.
Hope this gives you a steer.
Upvotes: 1