Martin Hansen

Reputation: 2101

Airflow: How to start operators in parallel after first operator has finished

Currently I have a DAG consisting of 4 operators, as shown below:

# Airflow 1.x import paths (provide_context is an Airflow 1.x argument)
from airflow import DAG
from airflow.operators.python_operator import BranchPythonOperator, PythonOperator

with DAG('dag', default_args=args, schedule_interval=schedule_interval, catchup=True) as dag:
    # task_id values must be unique within a DAG
    main_dag = PythonOperator(
        task_id='1',
        python_callable=func,
        provide_context=True,
        dag=dag)

    run_after_main_dag_1 = PythonOperator(
        task_id='2',
        python_callable=foo,
        provide_context=True,
        dag=dag)

    run_after_main_dag_2 = BranchPythonOperator(
        task_id='3',
        python_callable=foo,
        provide_context=True)

    run_after_main_dag_2_2 = PythonOperator(
        task_id='4',
        python_callable=foo,
        provide_context=False,
        dag=dag)

# This runs sequentially, but shouldn't.
main_dag >> run_after_main_dag_1 >> run_after_main_dag_2 >> run_after_main_dag_2_2

Here's what I'd like to achieve:

  1. Run main_dag operator

  2. Once main_dag has finished, start run_after_main_dag_1 and run_after_main_dag_2 in parallel, as they are independent of each other.

I simply can't find how to achieve this anywhere in the docs. There must be a simple syntax I have completely overlooked.

Does anyone know how to make this happen?

Upvotes: 2

Views: 1850

Answers (2)

mad_

Reputation: 8273

In Airflow, >> and << are used to set downstream and upstream dependencies.

Your code

main_dag >> run_after_main_dag_1 >> run_after_main_dag_2 >> run_after_main_dag_2_2  # sequential

actually defines a relationship that runs sequentially, because run_after_main_dag_1's upstream is set to main_dag, run_after_main_dag_2's upstream is set to run_after_main_dag_1, and so on.

To separate run_after_main_dag_1 and run_after_main_dag_2, define the relationships so that both have main_dag as their upstream task:

main_dag >> run_after_main_dag_1  # depends only on main_dag
main_dag >> run_after_main_dag_2  # depends only on main_dag

Airflow can then kick off the two tasks in parallel once the main_dag task finishes executing.
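To make the difference concrete, here is a toy model of the >> operator in plain Python. It is only a sketch and not Airflow's actual implementation: the Task class and the names main, t1, t2 and t3 are made up for illustration. The one behaviour it copies from Airflow is that a >> b records a as upstream of b and then returns b, which is why a chain ends up sequential.

```python
# Toy stand-in for an Airflow operator; NOT Airflow's real implementation.
class Task:
    def __init__(self, task_id):
        self.task_id = task_id
        self.upstream = set()  # task_ids that must finish before this task starts

    def __rshift__(self, other):
        # a >> b: record a as upstream of b, then return b so a chain continues from b
        other.upstream.add(self.task_id)
        return other


def build():
    # Hypothetical task names, used only in this sketch
    return tuple(Task(name) for name in ("main", "t1", "t2", "t3"))


def upstream_map(tasks):
    return {t.task_id: sorted(t.upstream) for t in tasks}


# Chained form: every task waits for the one written before it.
main, t1, t2, t3 = build()
main >> t1 >> t2 >> t3
chained = upstream_map((main, t1, t2, t3))

# Fan-out form: t1 and t2 both wait only for main.
main, t1, t2, t3 = build()
main >> t1
main >> t2 >> t3
fanned = upstream_map((main, t1, t2, t3))

print(chained)
print(fanned)
```

In the chained form t2 lists t1 as its upstream, while in the fan-out form both t1 and t2 list only main, so nothing forces one to wait for the other.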

Upvotes: 0

Martin Hansen

Reputation: 2101

So there was a simple answer:

main_dag >> run_after_main_dag_1
main_dag >> run_after_main_dag_2 >> run_after_main_dag_2_2
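For anyone who wants to check the resulting order, here is a small sketch in plain Python (no Airflow involved) that groups the tasks into "waves" based on the two dependency lines above. The execution_waves helper and the upstream dict are my own illustration, and it is a simplification: it ignores whatever branch the BranchPythonOperator picks, and whether tasks in the same wave really run at the same time depends on the executor (the SequentialExecutor, for example, runs one task at a time).

```python
# Dependency map for the two lines above: task -> set of tasks it waits for.
upstream = {
    "main_dag": set(),
    "run_after_main_dag_1": {"main_dag"},
    "run_after_main_dag_2": {"main_dag"},
    "run_after_main_dag_2_2": {"run_after_main_dag_2"},
}


def execution_waves(upstream):
    """Group tasks into waves; every task in a wave has all of its upstream tasks done."""
    done, waves = set(), []
    while len(done) < len(upstream):
        ready = sorted(
            task for task, deps in upstream.items()
            if task not in done and deps <= done
        )
        if not ready:
            raise ValueError("cycle in dependencies")
        waves.append(ready)
        done.update(ready)
    return waves


waves = execution_waves(upstream)
for number, wave in enumerate(waves, 1):
    print(number, wave)
```

run_after_main_dag_1 and run_after_main_dag_2 end up in the same wave, which is the parallel start the question asked for.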

Upvotes: 1
