Reputation: 31
My Airflow DAG is triggered twice every Monday with the configuration below.
When I use the cron expression
30 11 * * 1
the DAG doesn't trigger at all, so I figured I had to add one more * to the expression:
30 11 * * 1 *
With that, it works.
default_args:
'start_date': airflow.utils.dates.days_ago(1)
DAG:
schedule_interval='30 11 * * 1 *',  ## This is meant to be a weekly run on Monday at 11:30.
However, the DAG gets triggered twice every Monday, one minute apart.
What could be the reason?
Upvotes: 1
Views: 3743
Reputation: 31
So I finally figured out the issue.
Yes, the 5-field cron expression is the correct one.
I am using schedule_interval = '30 11 * * 1'  # every Monday at 11:30 UTC
It wasn't working because of my start_date:
'start_date': airflow.utils.dates.days_ago(1)
I found a blog post on Airflow, "Trick to find the exact start_date via CRON expression", that explains this.
Airflow triggers a run only once its whole schedule interval has passed after start_date, so if it's a weekly job, your start_date should be a week ago.
So I changed it to 'start_date': airflow.utils.dates.days_ago(7)
Now it is working fine.
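To see why days_ago(1) never fires for a weekly schedule, here is a rough model in plain Python. It is only a sketch and not Airflow's scheduler code: it assumes the first run is created once the first full weekly interval after start_date has elapsed, and that start_date is recomputed relative to the current time on every parse. The helper names (first_tick_on_or_after, first_run_due, and this local days_ago) are made up for illustration.

```python
from datetime import datetime, timedelta

WEEK = timedelta(days=7)

def days_ago(n, now):
    """Midnight n days before `now` (imitates airflow.utils.dates.days_ago)."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight - timedelta(days=n)

def first_tick_on_or_after(start):
    """First Monday 11:30 at or after `start`, i.e. the schedule '30 11 * * 1'."""
    candidate = start.replace(hour=11, minute=30, second=0, microsecond=0)
    # weekday(): Monday == 0, so this moves forward to the next Monday if needed
    candidate += timedelta(days=(0 - candidate.weekday()) % 7)
    if candidate < start:
        candidate += WEEK
    return candidate

def first_run_due(now, start_date):
    """Assumed rule: the first run is due once [tick0, tick0 + 1 week) has elapsed."""
    tick0 = first_tick_on_or_after(start_date)
    return tick0 + WEEK <= now

# Monday 2019-01-07 at 11:30, the moment the weekly run is expected
now = datetime(2019, 1, 7, 11, 30)
print(first_run_due(now, days_ago(1, now)))  # start_date only one day back
print(first_run_due(now, days_ago(7, now)))  # start_date a full week back
```

Under these assumptions, a start_date that always sits one day behind the clock can never have a complete week after it, while days_ago(7) leaves exactly one full interval ending at Monday 11:30.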
Thank you!!!
Upvotes: 2
Reputation: 2780
The cron parser that Airflow uses (croniter) interprets the 6th field as seconds, as you can see here: https://github.com/kiorky/croniter/blob/master/src/croniter/tests/test_croniter.py#L14
I'm assuming that your DAG finishes in under a minute. On the next scheduler loop, the scheduler sees that the cron schedule still matches (on the 58th second), so it starts the DAG again.
I was having the same issue, because the Airflow documentation linked to a Wikipedia entry about cron that showed 6 fields. A 6-field expression is non-standard, and there is more than one implementation. Anyway, for Airflow, the 6th field is interpreted as seconds.
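To illustrate the effect of a trailing seconds field, here is a tiny hand-rolled matcher in plain Python. It is not croniter and only understands literal numbers and *; the function name matches is made up for this sketch. It assumes, as described above, that a 6th field means seconds, and that a 5-field expression fires only at second 0.

```python
from datetime import datetime, timedelta

def matches(expr, moment):
    """Check `moment` against a 5- or 6-field cron string (numbers and '*' only)."""
    fields = expr.split()
    if len(fields) == 5:
        # Assumption: a plain 5-field expression fires once, at the top of the minute
        fields.append("0")
    values = [
        moment.minute,
        moment.hour,
        moment.day,
        moment.month,
        moment.isoweekday() % 7,  # cron day-of-week: 0 = Sunday, 1 = Monday
        moment.second,
    ]
    return all(f == "*" or int(f) == v for f, v in zip(fields, values))

# Every second from Monday 2019-01-07 11:30:00 through 11:31:59
start = datetime(2019, 1, 7, 11, 30, 0)
seconds = [start + timedelta(seconds=s) for s in range(120)]

six_field_hits = sum(matches("30 11 * * 1 *", t) for t in seconds)
five_field_hits = sum(matches("30 11 * * 1", t) for t in seconds)
print(six_field_hits, five_field_hits)
```

With the extra *, every second of 11:30 counts as a match in this model, so a scheduler that checks again within that minute can find the schedule matching a second time.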
Your 5-field cron expression should work. Maybe try again? However, change the dag_id, or you may run into weird behaviour. From https://cwiki.apache.org/confluence/display/AIRFLOW/Common+Pitfalls :
"Changing schedule interval always requires changing the dag_id, because previously run TaskInstances will not align with the new schedule interval"
Upvotes: 0
Reputation: 717
The 6-field cron expression is incorrect; the first one you used is correct. How many times did you run the DAG?
I suggest you try schedule_interval='@weekly' first and see what happens.
Upvotes: 0