Reputation: 65
I'm trying to run multiple tasks in parallel on a single EC2 instance. I've set up a PostgreSQL backend and switched to the LocalExecutor, in addition to changing the relevant settings in the config file. However, my tasks still execute sequentially rather than in parallel. Is it theoretically possible to run multiple tasks in parallel with the PythonOperator? Ideally, I would expect to see alternating "Hi" and "Hello" lines when I run this DAG, but instead I get 10 "Hello"s followed by 10 "Hi"s. If I can get this toy example working, it will really help with a workflow I'm trying to design.
import time

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.utils.dates import days_ago

with DAG(
    "test-airflow",
    default_args=default_args,
    description="airflow pipeline features",
    start_date=days_ago(2),
    tags=["example"],
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # [END instantiate_dag]
    # t1, t2 and t3 are examples of tasks created by instantiating operators
    # [START basic_task]
    def test():
        for i in range(10):
            print("Hello")
            time.sleep(1)
        return 3

    def test1():
        for i in range(10):
            print("Hi")
            time.sleep(1)
        return 3

    t = PythonOperator(task_id="test", python_callable=test)
    t1 = PythonOperator(task_id="test1", python_callable=test1)

    # No dependency is set between t and t1, so they should be free to run in parallel
    [t, t1]
Upvotes: 0
Views: 633
Reputation: 65
I ended up needing to restart Airflow on the EC2 instance after I installed the PostgreSQL backend and changed the config file. Airflow wasn't picking up the new configuration until I restarted it, but that fixed the issue.
Upvotes: 0
Reputation: 3955
With the LocalExecutor you should be able to run tasks in parallel with this configuration (in your airflow.cfg file):
[core]
executor = LocalExecutor
... more config here ...
# This makes LocalExecutor spawn a process for each task
# No queue needed!
parallelism = 0
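A quick way to sanity-check that the file the scheduler reads actually contains these values is to parse it with the standard library. This is a minimal sketch; the helper name `check_parallel_config` is made up for illustration, and the default config path `$AIRFLOW_HOME/airflow.cfg` (usually `~/airflow/airflow.cfg`) is an assumption about your setup:

```python
import configparser

def check_parallel_config(path):
    # Read airflow.cfg and pull out the two settings that control
    # whether tasks can run concurrently on one machine.
    cfg = configparser.ConfigParser()
    cfg.read(path)
    executor = cfg.get("core", "executor", fallback="SequentialExecutor")
    parallelism = cfg.getint("core", "parallelism", fallback=32)
    # LocalExecutor with parallelism != 1 allows concurrent tasks;
    # parallelism = 0 removes the limit entirely.
    return executor, parallelism
```

On Airflow 2.x the CLI can also report the effective values directly, e.g. `airflow config get-value core executor`. Either way, remember that the scheduler only reads the config at startup, so restart it after editing the file.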
Upvotes: 0