ChangeIsTheGame

Reputation: 25

DAGs are not picked up on schedule when the schedule interval is less than min_file_process_interval in airflow.cfg (Airflow v1.10.2)

In airflow.cfg, I set min_file_process_interval to 120 seconds. The DAG has a schedule interval of one minute, but it is only being scheduled every 120 seconds (matching the min_file_process_interval value). Is this expected?

When I changed min_file_process_interval to 200 seconds, the DAG started being picked up every 200 seconds. Just to clarify: in the opposite case, where the DAG's schedule interval is 2 minutes and min_file_process_interval is 1 minute, the DAG should run on its schedule, correct? Below is my DAG:

from airflow import DAG
from airflow.operators.bash_operator import BashOperator
from datetime import datetime, timedelta

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2019, 5, 14),
    'retries': 0,
}

dag = DAG('print', catchup=False, default_args=default_args, schedule_interval='*/1 * * * *')

t1 = BashOperator(
    task_id='print_date',
    bash_command='echo `date` , ',
    dag=dag)

Upvotes: 0

Views: 337

Answers (1)

AC at CA

Reputation: 735

Per the Airflow documentation:

In cases where there are only a small number of DAG definition files, the loop could potentially process the DAG definition files many times a minute. To control the rate of DAG file processing, the min_file_process_interval can be set to a higher value. This parameter ensures that a DAG definition file is not processed more often than once every min_file_process_interval seconds.

This means Airflow processes each DAG definition file at most once every min_file_process_interval seconds, so a DAG cannot be picked up more often than that. You should therefore set your DAG's schedule interval to a multiple of min_file_process_interval.
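To see why the numbers in the question come out the way they do, here is a rough sketch of the timing. It is a simplified model, not Airflow's scheduler code: the function name pickup_times is hypothetical, and it assumes the DAG file is processed exactly once every min_file_process_interval seconds (in practice that value is only a lower bound) and that, with catchup=False, each processing pass picks up at most the latest due run.

```python
def pickup_times(schedule_interval_s, min_file_process_interval_s, horizon_s):
    """Return the times (in seconds) at which a new DAG run would be picked up.

    Simplified model (an assumption, not Airflow internals): the DAG file is
    processed at t = P, 2P, 3P, ... and each pass picks up at most the most
    recent schedule tick that has not been handled yet.
    """
    pickups = []
    last_handled_tick = 0  # most recent schedule tick already picked up
    t = min_file_process_interval_s
    while t <= horizon_s:
        # Latest schedule tick that is due at processing time t.
        latest_tick = (t // schedule_interval_s) * schedule_interval_s
        if latest_tick > last_handled_tick:
            pickups.append(t)
            last_handled_tick = latest_tick
        t += min_file_process_interval_s
    return pickups


# Schedule every 60 s, file processed every 120 s: picked up every 120 s.
print(pickup_times(60, 120, 600))
# Schedule every 60 s, file processed every 200 s: picked up every 200 s.
print(pickup_times(60, 200, 600))
# Schedule every 120 s, file processed every 60 s: picked up on schedule.
print(pickup_times(120, 60, 600))
```

Under these assumptions, the first two cases reproduce the 120-second and 200-second behaviour described in the question, and the third shows the DAG running on its own 2-minute schedule when min_file_process_interval is smaller than the schedule interval.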

Upvotes: 1
