Reputation: 1
I am running a workflow with a schedule_interval of 15 minutes. After the workflow finished executing, I modified its configuration and changed the schedule_interval to 10 minutes.
After this change, it creates some extra instances based on the difference between the last execution date and the current start date. In those instances the tasks never start; the DAG is just shown in the running state indefinitely.
How can I prevent these extra instances, or make them fail by default?
Suppose the start date of the DAG is 29/7/2019T12:00PM and the schedule interval is 15 minutes.
After two instances have run, at 12:32PM I update the workflow's schedule_interval to 10 minutes and the start_date to the current time, i.e. 29/7/2019T12:32PM.
In this case the last execution date of the DAG was 29/7/2019T12:30PM.
According to the last execution date and the updated schedule_interval, the next instance should run at 12:40PM, but the next execution date according to the previous schedule_interval is 12:45PM.
So it won't run this instance, and it gives a dependency error saying:
your execution date is less than the start date.
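To make the arithmetic in this scenario explicit, here is a minimal sketch using only Python's standard library. It reproduces the timestamps stated above and is not the scheduler's actual logic; the variable names are illustrative.

```python
from datetime import datetime, timedelta

# Values taken from the scenario described above.
last_execution_date = datetime(2019, 7, 29, 12, 30)
new_start_date = datetime(2019, 7, 29, 12, 32)
old_interval = timedelta(minutes=15)
new_interval = timedelta(minutes=10)

# Next execution date under each interval, counted from the last execution date.
next_by_old_interval = last_execution_date + old_interval  # 12:45
next_by_new_interval = last_execution_date + new_interval  # 12:40

# The last recorded execution date now lies before the updated start date,
# which is the kind of mismatch the dependency error complains about.
last_run_precedes_new_start = last_execution_date < new_start_date

print(next_by_old_interval.time(), next_by_new_interval.time())
print(last_run_precedes_new_start)
```

The two candidate times differ by five minutes, and the run history recorded under the old configuration no longer lines up with the new start date.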
Upvotes: 0
Views: 1300
Reputation: 11607
The fool-proof way to dodge such scheduler misbehaviour is to not confuse it, i.e. do NOT change the schedule_interval of a DAG. In other words, if you have to modify the schedule interval of a DAG, just rename it; this makes Airflow forget the old DAG and treat the renamed DAG as a brand new one.
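One way to make the renaming habit hard to forget is to derive the DAG name from the interval itself, so that changing the interval always yields a new name. The sketch below is only an illustration of that idea: versioned_dag_id and the base name my_workflow are hypothetical, not part of Airflow.

```python
from datetime import timedelta


def versioned_dag_id(base_name: str, interval: timedelta) -> str:
    """Build a DAG id that embeds the schedule interval in minutes (hypothetical helper)."""
    minutes = int(interval.total_seconds() // 60)
    return f"{base_name}_every_{minutes}min"


# Changing the interval changes the id, so the scheduler sees a brand new DAG.
old_dag_id = versioned_dag_id("my_workflow", timedelta(minutes=15))
new_dag_id = versioned_dag_id("my_workflow", timedelta(minutes=10))

print(old_dag_id)
print(new_dag_id)
```

The resulting string would be passed as the dag_id argument when constructing the DAG, alongside the same interval as schedule_interval. The old DAG and its run history remain in the metadata database under the old name until removed.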
But my anecdotal tip is that if you ever happen to modify the schedule interval of a DAG (say, accidentally), restarting the Airflow scheduler / webserver processes also solves the problem.
Upvotes: 5