MrBronson

Reputation: 612

How to configure Airflow dag start_date to run tasks like in cron

I’m new to Airflow and I’m trying to understand how to use the scheduler correctly. Basically, I want to schedule tasks the same way I do with cron. There’s a task that needs to run every 5 minutes, and I want its first run to land on the next even 5-minute slot after I add the DAG file to the dags directory or make changes to the DAG file.

I know that a DAG run is triggered at the end of its schedule_interval, not at the beginning. If I add a new DAG and use start_date=days_ago(0), I get unnecessary runs for every interval since the beginning of the day. It also feels stupid to hardcode a specific start date in the DAG file, e.g. start_date=datetime(2019, 9, 4, 10, 1, 0, 818988). Is my approach wrong, or is there some specific reason why start_date needs to be set?
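To put a number on those unnecessary runs, here is a rough plain-Python sketch of the arithmetic. It is not Airflow code: `completed_intervals` is a hypothetical helper, and it assumes catchup is left enabled so that every fully elapsed interval since start_date gets its own run.

```python
from datetime import datetime, timedelta


def completed_intervals(start_date, now, interval):
    """Return the start of every schedule interval that has fully elapsed by `now`.

    Hypothetical helper for illustration only; it mimics the idea that a run
    for an interval is created once that interval has ended.
    """
    starts = []
    current = start_date
    while current + interval <= now:
        starts.append(current)
        current += interval
    return starts


# Assumed scenario: start_date is midnight and the DAG file is added at 10:01.
midnight = datetime(2019, 9, 4, 0, 0)
now = datetime(2019, 9, 4, 10, 1)
five_minutes = timedelta(minutes=5)

intervals = completed_intervals(midnight, now, five_minutes)
backlog = len(intervals)       # number of runs queued up since midnight
latest = intervals[-1]         # start of the most recent completed interval

print(backlog, latest)
```

In this scenario the backlog is every 5-minute interval between midnight and 10:00, which is exactly the pile of runs I would like to avoid.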

Upvotes: 0

Views: 6824

Answers (1)

MrBronson

Reputation: 612

I think I found an answer to my own question from the official documentation: https://airflow.apache.org/scheduler.html#backfill-and-catchup

With catchup turned off, the scheduler creates a DAG run only for the most recent interval instead of backfilling every interval since start_date. So I can set start_date to anything in the past and define the DAG like this:

dag = DAG('good-dag', catchup=False, default_args=default_args, schedule_interval='*/5 * * * *')
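As a sanity check on what "most recent interval" means for a */5 schedule, here is a small plain-Python sketch of the slot arithmetic. It does not call Airflow: `latest_and_next` is a hypothetical helper, and it assumes the interval length divides 60 evenly so the slots line up with the clock the way the cron expression does.

```python
from datetime import datetime, timedelta


def latest_and_next(now, minutes=5):
    """Floor `now` to the N-minute grid.

    Returns (latest_start, latest_end, next_trigger):
    the most recent fully elapsed interval and the time the following
    interval ends. Hypothetical helper; assumes `minutes` divides 60.
    """
    interval = timedelta(minutes=minutes)
    latest_end = now.replace(
        minute=now.minute - now.minute % minutes, second=0, microsecond=0
    )
    latest_start = latest_end - interval
    next_trigger = latest_end + interval
    return latest_start, latest_end, next_trigger


# Assumed moment at which the DAG file is picked up by the scheduler.
now = datetime(2019, 9, 4, 10, 1, 0, 818988)
latest_start, latest_end, next_trigger = latest_and_next(now)

print(latest_start, latest_end, next_trigger)
```

For a DAG picked up at 10:01, the most recent completed interval is 09:55 to 10:00, and the following interval ends at 10:05, so from there on the runs fall on the even 5-minute slots like they would in cron.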

Upvotes: 5
