Reputation: 5254
Is there a maximum number of DAGs that can be run in a single Airflow or Cloud Composer environment?
If this depends on several factors (Airflow infrastructure config, Composer cluster specs, number of active runs per DAG, etc.), what are all the factors that affect it?
Upvotes: 2
Views: 4254
Reputation: 94
It depends on the resources available to your Airflow installation and on the type of executor. There are also limits on how many tasks and DAG runs can run concurrently, defined in the [core] section of airflow.cfg:
# The amount of parallelism as a setting to the executor. This defines
# the max number of task instances that should run simultaneously
# on this airflow installation
parallelism = 124
# The number of task instances allowed to run concurrently within a single DAG
# (this option was renamed max_active_tasks_per_dag in Airflow 2.2)
dag_concurrency = 124
# The maximum number of active DAG runs per DAG
max_active_runs_per_dag = 500
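To see how these three settings combine, here is a minimal sketch that parses a [core] section with Python's standard configparser and reports the resulting ceilings. The helper name concurrency_limits and the result keys are made up for illustration, and the sample text simply repeats the values above. It assumes the usual reading of these options: parallelism caps running task instances across the installation, dag_concurrency caps them within one DAG, so the per-DAG ceiling is the smaller of the two.

```python
import configparser

# Sample [core] section repeating the values quoted above.
SAMPLE_CFG = """
[core]
parallelism = 124
dag_concurrency = 124
max_active_runs_per_dag = 500
"""


def concurrency_limits(cfg_text):
    """Return the concurrency ceilings implied by a [core] section (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    core = parser["core"]

    parallelism = core.getint("parallelism")
    dag_concurrency = core.getint("dag_concurrency")
    max_active_runs = core.getint("max_active_runs_per_dag")

    return {
        # Cap on task instances running at once across the installation.
        "max_running_tasks_total": parallelism,
        # A single DAG can never exceed the installation-wide cap.
        "max_running_tasks_per_dag": min(parallelism, dag_concurrency),
        # Cap on simultaneously active runs of one DAG.
        "max_active_runs_per_dag": max_active_runs,
    }


if __name__ == "__main__":
    for name, value in concurrency_limits(SAMPLE_CFG).items():
        print(f"{name} = {value}")
```

None of these values limits how many DAGs can be defined; they only limit how much work runs at the same time.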
Upvotes: 0
Reputation: 5254
I found from the Composer docs that Composer uses CeleryExecutor and runs it on Google Kubernetes Engine (GKE).
Airflow itself has no limit on the maximum number of DAGs; the practical ceiling is a function of the resources (nodes, CPU, memory) available. Assuming resources are available, the Airflow configuration options are just limit settings that can become a bottleneck and have to be raised.
There is a helpful guide on how to do this in Cloud Composer here. Once you enable autoscaling in the underlying GKE cluster and raise the hard limits specified in the Airflow configuration, there should be no fixed limit on the maximum number of tasks.
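As a rough illustration of raising those limits in Composer, the sketch below builds the argument for gcloud's --update-airflow-configs flag, assuming the section-key=value naming that Composer uses for Airflow configuration overrides. The helper name composer_override_flag and the numbers are hypothetical, not recommendations; check the Composer docs for the exact command for your environment.

```python
def composer_override_flag(overrides):
    """Build an --update-airflow-configs argument (hypothetical helper).

    `overrides` maps (section, key) tuples to values; Composer is assumed
    to expect each override written as section-key=value.
    """
    pairs = [
        f"{section}-{key}={value}"
        for (section, key), value in sorted(overrides.items())
    ]
    return "--update-airflow-configs=" + ",".join(pairs)


if __name__ == "__main__":
    # Illustrative values only.
    flag = composer_override_flag({
        ("core", "parallelism"): 256,
        ("core", "dag_concurrency"): 64,
    })
    print(flag)
```

The resulting string would be appended to a gcloud composer environments update command for the environment in question.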
For vanilla Airflow, it will depend on the executor you are using. It will be easier to scale up if you use the KubernetesExecutor and handle the autoscaling in K8s.
If you are using LocalExecutor and are facing slow performance, you can improve it by increasing the resources (CPU, memory) allocated to your Airflow installation.
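On a self-managed installation, the same limits can also be raised without editing airflow.cfg, because Airflow reads overrides from environment variables named AIRFLOW__{SECTION}__{KEY}. A small sketch of that naming follows; the helper name airflow_env_overrides and the values are hypothetical.

```python
def airflow_env_overrides(settings):
    """Map (section, key) settings to Airflow's environment variable names.

    Airflow picks up config overrides from variables of the form
    AIRFLOW__{SECTION}__{KEY}, with a double underscore on both sides.
    """
    return {
        f"AIRFLOW__{section.upper()}__{key.upper()}": str(value)
        for (section, key), value in settings.items()
    }


if __name__ == "__main__":
    # Illustrative values only; size them to the CPU and memory you have.
    env = airflow_env_overrides({
        ("core", "parallelism"): 64,
        ("core", "max_active_runs_per_dag"): 16,
    })
    for name, value in env.items():
        print(f"export {name}={value}")
```

Raising these numbers only helps if the machine or cluster has the capacity to run the extra tasks.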
Upvotes: 2