Padmavathi Battu
Padmavathi Battu

Reputation: 1

How to migrate data from one cosmos db to another cosmos db using ADF with copy task and some conditions

I have two Azure Cosmos dbs, I need to migrate data from one db to another db. I have 1 container each in source and sink which has ids and both side data is similar (some has latest data). I would like to migrate data from source to sink. Now i have two cases

first i need to check the source ids with sink side ids and if id exists below actions needs to be taken

  1. If Id does not exist then copy data with insert to sink
  2. If the id exists in both sides(Source & sink) then check for latest using if source timestamp> sink timestamp then upsert to sink
  3. If ID exists and source timestamp < sink timestamp then do nothing How can i achieve this in ADF pipeline. Currently i have only a copy task which inserts/upserts all the data to sink. I would like to do this operation based on above conditions.

sink has only

Sink image

update or insert but not how to filter this condition in ADF pipeline

Upvotes: 0

Views: 643

Answers (1)

Aswin
Aswin

Reputation: 7156

Upsert option in ADF copy activity will insert if unique key column value is not there in sink and it will update the sink, when key column value is there in the sink. It will not check if there is change in the source data and sink data. It will just update when key columns are matching between source and sink data store. For your question, you can filter the latest records only from the source and then copy to the sink container. This will make sure only changed records or new records are getting copied. Below is the change that you want to do in ADF copy activity.

  • In copy activity source settings, you can select Query option in the use query buttons.

  • In the query text box, click on add dynamic content and add the query

select * from c where c.timestamp_column > @{variables('date')}

  • In this example, I have stored the latest date of sink datastore in variable named date. You can store that value in some datastore and then use it in this query also.

This way , only latest records will be moved for upsert operation.

Upvotes: 0

Related Questions