xywz

Reputation: 45

Airflow creates a DAG Run for the most current instance of the DAG interval series and also for the one before it, even with catchup=False

While configuring my Airflow DAG, I wanted it to run only once when I start the Airflow scheduler. However, it runs both the most current instance of the DAG and the second most current instance, although the Airflow documentation says:

If your DAG is written to handle its own catchup (IE not limited to the interval, but instead to “Now” for instance.), then you will want to turn catchup off (Either on the DAG itself with dag.catchup = False) or by default at the configuration file level with catchup_by_default = False. What this will do, is to instruct the scheduler to only create a DAG Run for the most current instance of the DAG interval series.

My DAG configuration is below:

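from datetime import datetime, timedelta

from airflow import DAG

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2019, 1, 1),
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(hours=4),
}

# catchup=False is set on the DAG itself; the schedule is one run every two days.
dag = DAG('name', catchup=False, default_args=default_args, schedule_interval=timedelta(days=2))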
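To make the expectation concrete, here is a minimal sketch in plain Python of which single interval the quoted documentation would select for the start_date and schedule_interval above. It is only an illustration of the documented behavior, not Airflow's scheduler code: the function name `latest_complete_interval` and the sample value of `now` are hypothetical, and it assumes a run covers an interval only after that interval has fully elapsed.

```python
from datetime import datetime, timedelta


def latest_complete_interval(start_date, interval, now):
    """Return (start, end) of the most recent fully elapsed interval, or None.

    Hypothetical helper for illustration; it does not call Airflow.
    """
    if now < start_date + interval:
        # No interval has fully elapsed yet, so nothing would be scheduled.
        return None
    # Number of whole intervals that fit between start_date and now.
    elapsed = (now - start_date) // interval
    interval_start = start_date + (elapsed - 1) * interval
    return interval_start, interval_start + interval


# Same start_date and schedule_interval as the DAG above; "now" is made up.
window = latest_complete_interval(
    start_date=datetime(2019, 1, 1),
    interval=timedelta(days=2),
    now=datetime(2019, 1, 10, 12, 0),
)
print(window)
```

With catchup=False I would expect one DAG Run for that window only, and none for the window before it.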

Upvotes: 0

Views: 234

Answers (1)

brki

Reputation: 2780

This sounds very much like a known bug: https://issues.apache.org/jira/browse/AIRFLOW-1156

Upvotes: 1
