Reputation: 285
Is there any way to run a backfill sequentially, without running several days in parallel? For example, if I run a backfill over several dates, such as airflow backfill [dag] -s "2017-07-01" -e "2017-07-10", is there a way to finish the entire DAG run for one day before moving on to the next day? Right now it finishes each task for all days before going on to the next task.
Thanks.
Upvotes: 7
Views: 3936
Reputation: 2329
You can set the max_active_runs parameter of your DAG to 1, which makes sure that only one run of that DAG is scheduled at a time. https://pythonhosted.org/airflow/code.html?highlight=concurrency#models
If you need your entire DAG to complete before moving forward, you can add an ExternalTaskSensor to the start of your DAG and a DummyOperator collection task at the end. Then set the ExternalTaskSensor to wait for the DummyOperator at the end of the previous run.
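from datetime import datetime, timedelta
from airflow import DAG
# Import paths as in Airflow 1.x of that era; later releases moved these modules.
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.sensors import ExternalTaskSensor

schedule_interval = timedelta(days=1)  # assumed daily schedule
dag = DAG(dag_id='dag', start_date=datetime(2017, 7, 1),  # example start date
          schedule_interval=schedule_interval)
# First task: wait for the final task of the previous run of this same DAG.
wait_for_previous_operator = ExternalTaskSensor(
    task_id='wait_for_previous',
    external_dag_id='dag',
    external_task_id='collection',
    execution_delta=schedule_interval,
    dag=dag)
# Last task: succeeds only once every other task in the run has finished.
collection_operator = DummyOperator(
    task_id='collection',
    dag=dag)
# your_other_tasks_list stands for the list of your DAG's real tasks.
wait_for_previous_operator.set_downstream(your_other_tasks_list)
collection_operator.set_upstream(your_other_tasks_list)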
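To make the role of execution_delta concrete: as far as I understand it, the sensor looks for the external task at the current run's execution date minus execution_delta. The sketch below uses only the standard library and does not touch Airflow. The helper awaited_date is a made-up name, and the daily schedule is an assumption. It lists which earlier run each day of the backfill range from the question would wait for.

```python
from datetime import datetime, timedelta

schedule_interval = timedelta(days=1)  # assumed daily schedule


def awaited_date(execution_date, execution_delta=schedule_interval):
    """Execution date of the run whose 'collection' task is waited for."""
    return execution_date - execution_delta


# Backfill range taken from the question: 2017-07-01 through 2017-07-10.
start, end = datetime(2017, 7, 1), datetime(2017, 7, 10)
run_dates = [start + timedelta(days=i) for i in range((end - start).days + 1)]

for run in run_dates:
    print(run.date(), "waits for", awaited_date(run).date())
```

Note that the first run in the range would wait for 2017-06-30, which the backfill itself does not cover, so that first sensor presumably needs separate handling.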
Upvotes: 4