zorrrba

Reputation: 65

Airflow: how to run tasks in parallel

I'm trying to run multiple tasks in parallel on a single EC2 instance. I've set up a PostgreSQL backend, switched to the LocalExecutor, and updated the config file accordingly. However, my tasks seem to execute sequentially rather than in parallel. Is it possible to run multiple PythonOperator tasks in parallel? Ideally, I would expect to see alternating Hi's and Hello's when I run this DAG, but instead I get 10 Hello's and then 10 Hi's. If I can get this toy example working, it will really help with a workflow I'm trying to design.

with DAG(
    "test-airflow",
    default_args=default_args,
    description="airflow pipeline features",
    start_date=days_ago(2),
    tags=["example"],
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # [END instantiate_dag]

    # t and t1 are two independent tasks created by instantiating PythonOperator
    # [START basic_task]
    def test():
        for i in range(0,10):
            print('Hello')
            time.sleep(1)
        return 3
    
    def test1():
        for i in range(0,10):
            print('Hi')
            time.sleep(1)
        return 3

    t = PythonOperator(task_id='test', python_callable=test)

    t1 = PythonOperator(task_id='test1', python_callable=test1)


    # No dependency is declared between t and t1, so they are free to run concurrently
    [t, t1]

Upvotes: 0

Views: 633

Answers (2)

zorrrba

Reputation: 65

I ended up needing to restart the EC2 instance after I installed the PostgreSQL backend and changed the config file. Airflow wasn't picking up the new configuration until I restarted, and that fixed the issue.
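To confirm that two tasks really overlap after a restart, comparing their start and end times is more reliable than looking for interleaved prints, since each task instance records its own start and end time. Below is a minimal sketch of that overlap check. It uses plain threads as a stand-in for the executor, and every name in it (`timed`, `overlaps`, `run_concurrently`, `run_sequentially`, `short_task`) is hypothetical and not part of Airflow.

```python
import threading
import time


def timed(fn, results, key):
    """Run fn and record its (start, end) times under key."""
    start = time.monotonic()
    fn()
    results[key] = (start, time.monotonic())


def overlaps(a, b):
    """Return True if two (start, end) intervals share any time."""
    return a[0] < b[1] and b[0] < a[1]


def run_concurrently(fn_a, fn_b):
    """Run two callables in separate threads and return their intervals."""
    results = {}
    threads = [
        threading.Thread(target=timed, args=(fn_a, results, "a")),
        threading.Thread(target=timed, args=(fn_b, results, "b")),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results["a"], results["b"]


def run_sequentially(fn_a, fn_b):
    """Run two callables one after the other and return their intervals."""
    results = {}
    timed(fn_a, results, "a")
    timed(fn_b, results, "b")
    return results["a"], results["b"]


def short_task():
    # Stand-in for the sleeping loops in the question
    time.sleep(0.2)
```

Applying the same `overlaps` check to the start and end times of `test` and `test1` would show whether the executor ran them side by side or one after the other.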

Upvotes: 0

Iñigo González

Reputation: 3955

With the LocalExecutor you should be able to run tasks in parallel with this configuration (in your airflow.cfg file):

[core]
executor = LocalExecutor

... more config here ...

# This makes LocalExecutor spawn a process for each task
# No queue needed!
parallelism = 0
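A quick way to double-check what the file actually says is to parse it with the standard library. This is only a sketch: `SAMPLE_CFG` and `read_executor_settings` are hypothetical names, the sample text mirrors the fragment above, and the check ignores overrides such as the `AIRFLOW__CORE__EXECUTOR` environment variable, which takes precedence over the file.

```python
import configparser

# Hypothetical stand-in for the relevant part of airflow.cfg
SAMPLE_CFG = """
[core]
executor = LocalExecutor
parallelism = 0
"""


def read_executor_settings(cfg_text):
    """Return (executor, parallelism) from the [core] section of cfg_text."""
    # Interpolation is disabled because airflow.cfg values may contain '%'
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    executor = parser.get("core", "executor")
    parallelism = parser.getint("core", "parallelism")
    return executor, parallelism


if __name__ == "__main__":
    print(read_executor_settings(SAMPLE_CFG))
```

To inspect a real installation, the same function could be given the contents of your own airflow.cfg. The scheduler reads its configuration at startup, so a changed file only takes effect after a restart.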

Upvotes: 0
