Nadhas

Reputation: 5807

How to skip already copied files in the Azure Data Factory Copy Data tool?

I want to copy data from Blob Storage (Parquet format) to Cosmos DB, and I have scheduled the trigger to run every hour. But every run copies all the files/data again. How can I skip the files that have already been copied?

The data has no unique key, and we must not copy the same file content twice.

Upvotes: 1

Views: 1753

Answers (1)

Jay Gong

Reputation: 23792

Based on your requirements, you could use the modifiedDatetimeStart and modifiedDatetimeEnd properties of the Blob Storage dataset, so that each run only picks up files modified within the latest time window.


However, you would then need to update the dataset configuration via the SDK on every interval, to advance the values of those properties.
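The window logic behind those two properties can be sketched in plain Python. This is a minimal illustration of computing an hourly [start, end) window and filtering blob metadata against it; the blob names and timestamps are made up, and `hourly_window`/`blobs_in_window` are hypothetical helper names, not ADF APIs.

```python
from datetime import datetime, timedelta, timezone

def hourly_window(run_time):
    """Return (modifiedDatetimeStart, modifiedDatetimeEnd) for a run,
    covering the hour that ended at run_time."""
    end = run_time.replace(minute=0, second=0, microsecond=0)
    start = end - timedelta(hours=1)
    return start, end

def blobs_in_window(blobs, start, end):
    """Keep only blobs whose last-modified time falls in [start, end)."""
    return [name for name, modified in blobs if start <= modified < end]

# Example with made-up blob metadata (name, last_modified):
run = datetime(2020, 1, 1, 10, 5, tzinfo=timezone.utc)
start, end = hourly_window(run)
blobs = [
    ("a.parquet", datetime(2020, 1, 1, 9, 30, tzinfo=timezone.utc)),
    ("b.parquet", datetime(2020, 1, 1, 8, 45, tzinfo=timezone.utc)),
]
print(blobs_in_window(blobs, start, end))  # → ['a.parquet']
```

If you keep the pipeline in ADF, you may be able to derive such a window from the trigger's scheduled time instead of patching the dataset externally, but the pattern is the same: each run reads only blobs whose last-modified time falls inside its window.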

Two other solutions you could consider:

1. Use a blob-triggered Azure Function. It fires whenever a blob is created or modified, and inside the function you can transfer the data from Blob Storage to Cosmos DB with SDK code.
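Since the question says the data has no unique key, one way to make that function safe to re-run is to derive a deterministic document id from the record content, so writing the same data again becomes an upsert instead of a duplicate. A minimal sketch; `to_cosmos_document` is a hypothetical helper, and `record` stands for one row parsed out of the Parquet file:

```python
import hashlib
import json

def to_cosmos_document(blob_name, record):
    """Build a Cosmos DB document whose id is a hash of the record
    content, so re-processing the same data overwrites rather than
    duplicates. `record` is one row parsed from the Parquet file."""
    payload = json.dumps(record, sort_keys=True)  # canonical form
    doc_id = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return {"id": doc_id, "source_blob": blob_name, **record}

doc1 = to_cosmos_document("a.parquet", {"name": "x", "value": 1})
doc2 = to_cosmos_document("b.parquet", {"value": 1, "name": "x"})
print(doc1["id"] == doc2["id"])  # same content → same id → upsert, not duplicate
```

You would then write each document with the Cosmos DB SDK's upsert operation rather than a plain insert.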

2. Use Azure Stream Analytics. You can configure Blob Storage as the input and Cosmos DB as the output.
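For that route, the Stream Analytics job itself can be as simple as a pass-through query. A minimal sketch, assuming `blobinput` and `cosmosoutput` are the alias names you configure on the job's input and output:

```sql
-- Pass every event from the Blob Storage input to the Cosmos DB output.
SELECT *
INTO cosmosoutput
FROM blobinput
```

Stream Analytics tracks its position in the input stream, so each blob is processed as it arrives rather than re-read on a schedule.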

Upvotes: 1
