Anna

Reputation: 98

Apache Spark: setting executor instances

I run my Spark application on YARN with parameters:

in spark-defaults.conf:

spark.master yarn-client
spark.driver.cores 1
spark.driver.memory 1g
spark.executor.instances 6
spark.executor.memory 1g

in yarn-site.xml:

yarn.nodemanager.resource.memory-mb 10240

All other parameters are set to default.

I have a 6-node cluster, and the Spark Client component is installed on each node. Every time I run the application, only 2 executors and 1 driver are visible in the Spark UI. The executors appear on different nodes.

Why can't Spark create more executors? Why are there only 2 instead of 6?

I found a very similar question: Apache Spark: setting executor instances does not change the executors, but increasing the memory-mb parameter didn't help in my case.

Upvotes: 4

Views: 3222

Answers (1)

Anna

Reputation: 98

The configuration looks OK at first glance.

Make sure that you have edited the correct spark-defaults.conf file.

Execute echo $SPARK_HOME for the current user and verify that the modified spark-defaults.conf file is in the $SPARK_HOME/conf/ directory. Otherwise, Spark cannot see your changes.
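
For example (the path shown is only an illustration; yours will differ):

echo $SPARK_HOME
# prints e.g. /usr/lib/spark
grep executor.instances $SPARK_HOME/conf/spark-defaults.conf
# should print: spark.executor.instances 6

If the grep prints nothing, Spark is reading a different spark-defaults.conf than the one you edited.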

In my case, I had modified the wrong spark-defaults.conf file. There were two users on my system, and each user had a different $SPARK_HOME directory set (I didn't know that before). That's why my settings had no effect for one of the users.
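
If you have several users, a quick way to compare their settings is to print the variable from a login shell for each one (the user names below are placeholders):

sudo -u user1 bash -lc 'echo $SPARK_HOME'
sudo -u user2 bash -lc 'echo $SPARK_HOME'

If the two paths differ, each user is reading a different spark-defaults.conf.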

You can also run spark-shell or spark-submit with the argument --num-executors 6 (if you want 6 executors). If Spark then creates more executors than before, you can be sure that it's not a memory issue but a problem with the configuration file not being read.
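
A minimal example (the application JAR and class name are placeholders):

spark-shell --num-executors 6

spark-submit --num-executors 6 --executor-memory 1g --class com.example.MyApp my-app.jar

Command-line arguments take precedence over spark-defaults.conf, so this bypasses a configuration file that isn't being picked up.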

Upvotes: 1
