Reputation: 1
Is there any way we can fetch the max of last modified date from the last processed file and store it in a config table
Upvotes: 0
Views: 2884
Reputation: 5034
From Supported data stores and formats you can see Salesforce, Salesforce service cloud and Marketing cloud are supported.
You have to perform the following steps:
Follow this to setup Linked Service with Salesforce in Azure Data Factory
When copying data from Salesforce, you can use either SOQL query or SQL query. Note that these two has different syntax and functionality support, do not mix it. You are suggested to use the SOQL query, which is natively supported by Salesforce.
Process for incremental loading, or delta loading, of data through a watermark:
In this case, you define a watermark in your source database. A watermark is a column that has the last updated time stamp or an incrementing key. The delta loading solution loads the changed data between an old watermark and a new watermark. The workflow for this approach is depicted in the following diagram:
ADF will scan all the files from the source store, apply the file filter by their LastModifiedDate, and only copy the new and updated file since last time to the destination store.
For capabilities, Prerequisites and Salesforce request limits refer Copy data from and to Salesforce by using Azure Data Factory
Refer doc: Delta copy from a database with a control table This article describes a template that's available to incrementally load new or updated rows from a database table to Azure by using an external control table that stores a high-watermark value.
This template requires that the schema of the source database contains a timestamp column or incrementing key to identify new or updated rows.
The template contains four activities:
Go to the Delta copy from Database template. Create a New connection to the source database that you want to data copy from.
Create connections to the external control table and stored procedure that you created and Select Use this template.
Choose the available pipeline
For Stored procedure name, choose [dbo].[update_watermark]. Select Import parameter, and then select Add dynamic content.
click Add dynamic content and Type in below query. This will get a maximum date in your watermark column that we can use for delta slice.
You can use this query to fetch the max of last modified date from the last processed file
select MAX(LastModifytime) as NewWatermarkvalue from data_source_table"
or
For files only you can use Incrementally copy new and changed files based on LastModifiedDate by using the Copy Data tool
Refer: Source: ADF Incremental loading with configuration stored in a table
Upvotes: 2