ZCoder

Reputation: 2349

Azure Data Flow takes minutes to trigger the next pipeline

Azure Data Factory transfers the data to the database in about 10 milliseconds, but it then waits several minutes before triggering the next pipeline, so the whole run ends up taking 40 minutes. Every pipeline takes less than 20 ms to transfer its data, yet somehow each one waits a few minutes before the next is triggered.

I tried debug mode, and I also triggered the ADF pipeline from a Logic App without debug mode. Is there any way to optimize this? We want to move from SSIS to Data Flow, but this timing issue is a blocker: 40 minutes is far too long, and in the next step we will be processing millions of records.

So it took 7 seconds to transfer the data to the database, but then it waited for 6 minutes.


Upvotes: 1

Views: 1973

Answers (2)

Mark Kromer MSFT

Reputation: 3838

You will hit the Databricks cluster spin-up time during job (triggered) execution.

As long as you are in Debug mode, you'll always hit a warmed cluster while the debug session is still green.

We've added TTL to the Azure IR in the Data Flow configuration section so that you can keep a cluster alive for your next data flow activity and you won't incur the start-up penalty on your next execution.

Note that this option is greyed out at this time, but we will enable it soon.
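As an illustration of the TTL setting described above, a Managed Azure IR with data flow compute properties can be defined in the factory's JSON roughly like this (the IR name is a placeholder, and `timeToLive` is in minutes; check the current ADF schema before relying on it):

```json
{
  "name": "DataFlowWarmIR",
  "properties": {
    "type": "Managed",
    "typeProperties": {
      "computeProperties": {
        "location": "AutoResolve",
        "dataFlowProperties": {
          "computeType": "General",
          "coreCount": 8,
          "timeToLive": 15
        }
      }
    }
  }
}
```

With a non-zero `timeToLive`, the Spark cluster stays warm after a data flow activity finishes, so a subsequent activity that uses the same IR can skip the several-minute spin-up.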

Upvotes: 2

Leon Yue

Reputation: 16431

The document Monitor data flow performance mentions:

Note that you can assume 1 minute of cluster job execution set-up time in your overall performance calculations and if you are using the default Azure Integration Runtime, you may need to add 5 minutes of cluster spin-up time as well.

That may be the reason. You can start with this tutorial: Mapping data flows performance and tuning guide.

The document Execute data flow activity in Azure Data Factory can also help improve performance:

Choose the compute environment for this execution of your data flow. The default is the Azure Auto-Resolve Default Integration Runtime. This choice will execute the data flow on the Spark environment in the same region as your data factory. The compute type will be a job cluster, which means the compute environment will take several minutes to start-up.

You have control over the Spark execution environment for your Data Flow activities. In the Azure integration runtime are settings to set the compute type (general purpose, memory optimized, and compute optimized), number of worker cores, and time-to-live to match the execution engine with your Data Flow compute requirements. Also, setting TTL will allow you to maintain a warm cluster that is immediately available for job executions.

Note:

The Integration Runtime selection in the Data Flow activity only applies to triggered executions of your pipeline. Debugging your pipeline with Data Flows with Debug will execute against the 8-core default Spark cluster.
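To sketch what the note above means in practice: a triggered pipeline's Execute Data Flow activity must explicitly reference the custom IR for the warm-cluster settings to apply. Assuming a hypothetical custom Azure IR named `DataFlowWarmIR` has been configured with a TTL, the activity JSON might look like this (all names are placeholders):

```json
{
  "name": "RunMyDataFlow",
  "type": "ExecuteDataFlow",
  "typeProperties": {
    "dataFlow": {
      "referenceName": "MyDataFlow",
      "type": "DataFlowReference"
    },
    "integrationRuntime": {
      "referenceName": "DataFlowWarmIR",
      "type": "IntegrationRuntimeReference"
    },
    "compute": {
      "computeType": "General",
      "coreCount": 8
    }
  }
}
```

If the `integrationRuntime` reference is omitted, the activity falls back to the Auto-Resolve default IR and pays the job-cluster start-up time on every triggered run, which matches the multi-minute gaps described in the question.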

Hope this helps.

Upvotes: 3
