Reputation: 672
We have a few tasks that require a huge amount of resources and therefore can't be run with high parallelism, along with many smaller tasks that can run at a parallelism of 32.
I am aware of the parallelism config:
The amount of parallelism as a setting to the executor. This defines the max number of task instances that should run simultaneously on this airflow installation.
parallelism = 32
Is there a way to tag tasks and apply different levels of parallelism to different tasks across the entire Airflow installation?
For example, smaller tasks would run at the default parallelism [32], while heavy tasks would run at a much lower parallelism [1-4].
Upvotes: 0
Views: 241
Reputation: 3064
Pools (docs: https://airflow.apache.org/docs/apache-airflow/stable/concepts/pools.html) serve exactly this purpose: to limit the parallelism for a specific set of tasks.
You can create pools with your desired number of "slots" in the Airflow UI, and assign the pool to your task:
my_task = BashOperator(
    ...,
    pool="heavy_task_pool",
    ...,
)
Upvotes: 1