deltascience

Reputation: 3390

Airflow configuration recommendation for LocalExecutor

I am using the Airflow docker-compose setup from here, and I am seeing performance issues along with Airflow crashing unexpectedly.

I have 5 DAGs running at the same time. Each of them has 8 steps and max_active_runs=1:

 step1x
 step2y
 step3 >> step4 >> step8
 step3 >> step5 >> step8
 step3 >> step6 >> step8
 step3 >> step7 >> step8

I would like to know what configuration I should use to balance Airflow parallelism against stability, i.e. what the maximum recommended values of the options below are for a machine with X CPUs and Y GB of RAM. I am using the LocalExecutor but can't figure out how I should configure the parallelism:

AIRFLOW__SCHEDULER__SCHEDULER_MAX_THREADS=?
AIRFLOW__CORE__PARALLELISM=?
AIRFLOW__WEBSERVER__WORKERS=?

Is there any documentation that gives recommended values for each of these based on the machine's specification?

Upvotes: 2

Views: 525

Answers (1)

Dan Kleiman

Reputation: 86

I'm not sure you have a parallelism problem...yet.

Can you clarify something? Do you have 5 different DAGs with similar set-ups, or is this launching five instances of the same DAG at once? I'd expect the former because of the max_active_runs setting.

On your task declaration here:

 step1x
 step2y
 step3 >> step4 >> step8
 step3 >> step5 >> step8
 step3 >> step6 >> step8
 step3 >> step7 >> step8

Are you expecting step1x, step2y and step3 to all execute at the same time, then steps 4-7, and finally step8? What are you doing in the DAG that needs that kind of fan-out rather than running steps 1-8 sequentially?
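To make that question concrete, here is a rough sketch in plain Python (no Airflow import) that groups the declared tasks into "waves" based only on the dependencies shown above. The names `EDGES`, `TASKS`, `NUM_DAGS` and `execution_waves` are mine, purely for illustration, and the model assumes each wave finishes before the next one starts, which a real scheduler does not guarantee.

```python
from collections import defaultdict

# Dependencies copied from the declaration above.
# step1x and step2y have no upstream or downstream tasks.
EDGES = [
    ("step3", "step4"), ("step4", "step8"),
    ("step3", "step5"), ("step5", "step8"),
    ("step3", "step6"), ("step6", "step8"),
    ("step3", "step7"), ("step7", "step8"),
]
TASKS = ["step1x", "step2y", "step3", "step4",
         "step5", "step6", "step7", "step8"]
NUM_DAGS = 5  # from the question: 5 DAGs, each with max_active_runs=1


def execution_waves(tasks, edges):
    """Group tasks into waves; a task is ready once all its upstream tasks are done."""
    upstream = defaultdict(set)
    for parent, child in edges:
        upstream[child].add(parent)

    done, waves = set(), []
    remaining = list(tasks)
    while remaining:
        ready = [t for t in remaining if upstream[t] <= done]
        if not ready:
            raise ValueError("cycle detected in task dependencies")
        waves.append(ready)
        done.update(ready)
        remaining = [t for t in remaining if t not in done]
    return waves


waves = execution_waves(TASKS, EDGES)
peak_per_dag = max(len(wave) for wave in waves)  # widest wave in one DAG run
peak_total = peak_per_dag * NUM_DAGS             # if all DAGs hit it together

for i, wave in enumerate(waves, start=1):
    print(f"wave {i}: {wave}")
print("peak tasks per DAG:", peak_per_dag)
print("peak tasks overall:", peak_total)
```

Under this simplified model the widest wave is steps 4-7, so the five DAGs together could ask for about 20 task slots at once. If step1x or step2y run long enough to overlap with steps 4-7, the real peak could be as high as 6 per DAG. Either figure is what I would compare against AIRFLOW__CORE__PARALLELISM before tuning anything else.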

Upvotes: 1
