Koushik

Reputation: 11

Incremental loading of files from On-prem file server to Azure Data Lake

We would like to do incremental loading of files from our on-premises file server to Azure Data Lake using Azure Data Factory v2.

Files are dropped into the on-premises file server on a daily basis. We need to run the ADFv2 pipeline at regular intervals during the day, and only the new, unprocessed files in the folder should be picked up.

Upvotes: 0

Views: 335

Answers (2)

DraganB

Reputation: 1138

In the source dataset, you can apply a file filter. You can filter by time, for example (using the datetime functions of the expression language), or by anything else that identifies a new file: https://learn.microsoft.com/en-us/azure/data-factory/control-flow-expression-language-functions Then, with a schedule trigger, you can execute the pipeline n times during the day.
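For illustration, a minimal sketch of what such a time-based filter could look like in a Copy activity source for the File System connector. It assumes the daily files carry the date in their names; the folder name, file-name pattern, and the DelimitedText types here are placeholders, not part of the answer:

    "source": {
        "type": "DelimitedTextSource",
        "storeSettings": {
            "type": "FileServerReadSettings",
            "recursive": false,
            "wildcardFolderPath": "daily-drop",
            "wildcardFileName": "@concat('*_', formatDateTime(utcnow(), 'yyyyMMdd'), '.csv')"
        },
        "formatSettings": {
            "type": "DelimitedTextReadSettings"
        }
    }

Note that running the pipeline several times a day with a date-only filter like this can pick the same file up more than once, so you would still need a naming convention or watermark that distinguishes processed from unprocessed files.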

Upvotes: 0

ShirleyWang-MSFT

Reputation: 266

Our recommendation is to put the files for daily ingestion into /YYYY/MM/DD directories. You can refer to this example of how to use system variables (@trigger().scheduledTime) to read files from the corresponding directory:

https://learn.microsoft.com/en-us/azure/data-factory/how-to-read-write-partitioned-data
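Sketched out, that pattern could look like the following; the dataset, parameter, and folder names are illustrative, while the expression form follows the linked article. The source dataset takes the partition path as a parameter:

    {
        "name": "DailyDropFolder",
        "properties": {
            "type": "Binary",
            "linkedServiceName": {
                "referenceName": "OnPremFileServer",
                "type": "LinkedServiceReference"
            },
            "parameters": {
                "runDate": { "type": "string" }
            },
            "typeProperties": {
                "location": {
                    "type": "FileServerLocation",
                    "folderPath": {
                        "value": "@concat('daily-drop/', dataset().runDate)",
                        "type": "Expression"
                    }
                }
            }
        }
    }

The copy activity then passes the formatted date into the dataset, for example:

    "inputs": [{
        "referenceName": "DailyDropFolder",
        "type": "DatasetReference",
        "parameters": {
            "runDate": "@formatDateTime(pipeline().parameters.windowStart, 'yyyy/MM/dd')"
        }
    }]

where the schedule trigger supplies windowStart as "@trigger().scheduledTime" in its pipeline parameters, as shown in the linked article.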

Upvotes: 0
