Reputation: 163
I don't understand the point of specifying a DAG start_date in the past. I've read about catchup and backfill, but I still don't get it. In what context would I want to specify a start_date in the past?
Upvotes: 1
Views: 3364
Reputation: 1057
For a scheduled run, the Airflow scheduler waits for the schedule interval to complete before running your DAG.
For instance, say you want to run your DAG on a monthly basis and schedule it as 0 3 11 * *, which means run the DAG at 3 AM on the 11th day of every month.
Now, say you deploy your DAG on 10 January 2021 with that day as its start_date. You would expect it to run the next day. In reality, Airflow won't trigger your DAG until the next month, i.e. 11 February 2021, because the first interval (11 January to 11 February) only completes then. So Airflow waits about one month before actually triggering the run that was supposed to happen on 11 January 2021.
In this scenario, when you deploy your DAG you can set start_date to 10 December 2020, so that when the actual day (11 January 2021) comes, the scheduler treats the interval as complete and triggers your DAG.
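To make the two start_date choices above concrete, here is a rough sketch in plain Python (no Airflow import) that models the rule "a run is triggered when its interval ends" for the 0 3 11 * * schedule. The helper names schedule_points and first_trigger are made up for illustration and are not part of Airflow; the model also assumes the first interval starts at the first schedule point on or after start_date, which is a simplification of what the scheduler does.

```python
from datetime import datetime
from typing import List, Tuple


def schedule_points(after: datetime, count: int) -> List[datetime]:
    """Return the next `count` times matching "0 3 11 * *" on or after `after`.

    Hypothetical helper: the cron expression is hard-coded as
    "03:00 on the 11th of every month" to keep the sketch dependency-free.
    """
    points = []
    year, month = after.year, after.month
    while len(points) < count:
        candidate = datetime(year, month, 11, 3, 0)
        if candidate >= after:
            points.append(candidate)
        # Advance one calendar month.
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return points


def first_trigger(start_date: datetime) -> Tuple[datetime, datetime]:
    """Return (interval_start, trigger_time) for the first scheduled run.

    Assumption: the first interval begins at the first schedule point on or
    after start_date, and the run fires only once that interval has ended.
    """
    interval_start, interval_end = schedule_points(start_date, 2)
    return interval_start, interval_end


for start in (datetime(2021, 1, 10), datetime(2020, 12, 10)):
    interval_start, trigger = first_trigger(start)
    print(
        f"start_date={start:%Y-%m-%d}: interval starts "
        f"{interval_start:%Y-%m-%d %H:%M}, run triggered {trigger:%Y-%m-%d %H:%M}"
    )
```

Under this model, a start_date of 10 January 2021 gives a first trigger on 11 February 2021, while a start_date of 10 December 2020 gives a first trigger on 11 January 2021, which matches the behaviour described above.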
For more details, you can read: https://www.astronomer.io/guides/scheduling-tasks
Upvotes: 5