agw2021

Reputation: 344

Creating an alert for long running pipelines

I currently have an alert set up for Data Factory that sends an email if a pipeline runs longer than 120 minutes, following this tutorial: https://www.techtalkcorner.com/long-running-azure-data-factory-pipelines/. When a pipeline does in fact run longer than the expected time, I do receive an alert. However, I am also getting additional, unexpected alerts.

My query looks like:

 ADFPipelineRun
 | where Status == "InProgress" // Pipeline is in progress
 | where RunId !in ((ADFPipelineRun | where Status in ("Succeeded", "Failed", "Cancelled") | project RunId)) // Subquery, pipeline hasn't finished
 | where datetime_diff('minute', now(), Start) > 120 // It has been running for more than 120 minutes

I received an alert email on September 28th saying a pipeline had been running longer than 120 minutes, but when I try to find that pipeline in the Azure Data Factory pipeline runs, nothing shows up. The alert email has a button that says "View the alert in Azure monitor", and from there I can press "View Query Results" above the shown query. There I can re-enter the query above and filter the date to show all pipelines running longer than 120 minutes since September 27th, and it returns 3 pipelines.

Something I noticed about these pipelines is the end time column:

I'm thinking that the UTC time is not configured properly at some point, and that this may be what triggers the alert. Is there something I am doing wrong, or is there a better way to do this that avoids a bunch of false alarms?

Upvotes: 0

Views: 2596

Answers (2)

Nicholas

Reputation: 632

I'm not sure that you're seeing false alerts. What you've shown here looks like the correct behavior.

You need to keep in mind:

  1. The duration threshold should be offset by the time it takes for the logs to appear in Azure Monitor.
  2. The email alert takes you to the query that triggered the event. Your query only shows "InProgress" statuses, so the End property is not set/updated on those rows. You'll need to extend your query to look at one of the other statuses to see the actual duration.
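
As a rough sketch of point 2, a query along these lines lists the runs that have actually finished and took more than 120 minutes. It relies on the columns already used in this thread (RunId, Start, End, Status) plus PipelineName, which I'm assuming is present in your ADFPipelineRun table:

```kusto
ADFPipelineRun
| where Status in ("Succeeded", "Failed", "Cancelled") // terminal statuses only
| extend DurationMinutes = datetime_diff('minute', End, Start) // End minus Start, in minutes
| where DurationMinutes > 120 // same threshold as in the question
| project RunId, PipelineName, Start, End, Status, DurationMinutes
```

If a run that triggered an alert shows up here with a short duration, the alert most likely fired on a stale "InProgress" row rather than on a genuinely long run.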

Run another query with the RunId of the suspect runs to inspect the durations.

ADFPipelineRun
| where RunId == 'bf461c8b-0b1e-43c4-9cdf-7d9f7ccc6f06'
| distinct TimeGenerated, OperationName, RunId, Start, End, Status

For example:

[Image not available]
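
If the extra alerts turn out to come from the subquery not seeing a run's terminal status row, one alternative is to keep only the most recent log row per run and alert when that latest status is still "InProgress". This is a sketch under assumptions, not something tested against your workspace: the 120-minute threshold comes from the question, and PipelineName is assumed to exist in the table.

```kusto
let threshold = 120m; // from the question; adjust to account for log ingestion delay
ADFPipelineRun
| summarize arg_max(TimeGenerated, *) by RunId // keep the latest logged row for each run
| where Status == "InProgress" // the latest known status is still running
| where now() - Start > threshold // running longer than the threshold
| project RunId, PipelineName, TimeGenerated, Start, Status
```

Using arg_max on TimeGenerated avoids the separate "has it finished?" subquery, since a run that has logged a Succeeded, Failed or Cancelled row will have that row as its latest one.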

Upvotes: 0

Abhishek Khandave

Reputation: 3230

To create preemptive warnings for long-running jobs:

  1. Create an activity.
  2. Click on a blank space in the pipeline canvas.
  3. Follow the path: Settings > Elapsed time metric

[Image not available]

See: Operationalize Data Pipelines - Azure Data Factory

Upvotes: 1
