user7343922
user7343922

Reputation: 306

spark number of executors when dynamic allocation is enabled

I have a r5.8xlarge AWS cluster with 12 nodes, so there are 6144 cores (12nodes * 32vCPU * 16cores), I have set --executor-cores=5 and enabled the dynamic execution using the below spark-submit command, even after setting the spark.dynamicAllocation.initialExecutors=150 --conf spark.dynamicAllocation.minExecutors=150, I'm only seeing 70 executors in the spark-UI application, what am I doing wrong?

r5.8xlarge clusters have 256GB per node, so 3072GB(256GB*12nodes)

FYI -I'm not including the driver node in this calculation.

--driver-memory 200G --deploy-mode client --executor-memory 37G --executor-cores 7 --conf spark.dynamicAllocation.enabled=true --conf spark.shuffle.service.enabled=true --conf spark.driver.maxResultSize=0 --conf spark.sql.shuffle.partitions=2000 --conf spark.dynamicAllocation.initialExecutors=150  --conf spark.dynamicAllocation.minExecutors=150 

Upvotes: 1

Views: 1458

Answers (1)

Abdennacer Lachiheb
Abdennacer Lachiheb

Reputation: 4888

You have 256GB per node and 37G per executor, an executor can only be in one node (a executor cannot be shared between multiple nodes), so for each node you will have at most 6 executors (256 / 37 = 6), since you have 12 nodes so the max number of executors will be 6 * 12 = 72 executor which explain why you see only 70 executor in your spark ui (the difference of 2 executor's is caused by the memory allocated to the driver or maybe because of some memory allocation problem in some nodes).

If you want more executors then you have to decrease the memory of the executors, also to fully utilize your cluster make sure that the reminder of the the node memory divided by the executor memory is as close to zero as possible, ex:

  • 256GB per node and 37G per executor: 256 / 37 = 6.9 => 6 executor per node (34G lost per node)

  • 256GB per node and 36G per executor: 256 / 36 = 7.1 => 7 executor per node ( only 4G lost per node, so you gain 30G of unused memory per node)

If you want at least 150 executor then executor memory should be at most 19G

Upvotes: 2

Related Questions