Reputation: 55
I looked everywhere but couldn't find the answer I need. I'm running Spark 1.5.2 in Standalone Mode with SPARK_WORKER_INSTANCES=1, because I only want one executor per worker per host. What I would like is to increase the number of hosts my job runs on, and hence the number of executors. I've tried changing spark.executor.instances and spark.cores.max in spark-defaults.conf, but I'm still seeing the same number of executors. People suggest changing --num-executors; isn't that the same as spark.executor.instances?
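For reference, here is roughly what I have in spark-defaults.conf right now (the numbers are just placeholders, not my actual cluster sizes):

    # spark-defaults.conf -- illustrative values only
    spark.executor.instances   4
    spark.cores.max            8

Changing either of these still gives me the same number of executors.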
This Cloudera blog post http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/ says "The --num-executors command-line flag or spark.executor.instances configuration property control the number of executors requested. Starting in CDH 5.4/Spark 1.3, you will be able to avoid setting this property by turning on dynamic allocation with the spark.dynamicAllocation.enabled property", but I'm not sure if spark.dynamicAllocation.enabled only works for YARN.
Any suggestion on how to do this for Spark 1.5.2 is greatly appreciated!
Upvotes: 0
Views: 1698
Reputation: 8528
I don't believe you need to set up SPARK_WORKER_INSTANCES, and if you do want to use it, you also need to set the SPARK_WORKER_CORES environment variable; otherwise you will end up with one worker consuming all the cores, and the other workers can't be launched correctly!
I haven't seen spark.executor.instances used outside a YARN configuration with Spark.
That said, I would definitely suggest using --num-executors and having your cluster run multiple workers!
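As a minimal sketch (the core counts are purely illustrative, assuming an 8-core worker host, and the class name and jar below are placeholders), spark-env.sh on each worker host could look like this:

    # spark-env.sh on each worker host (illustrative values for an 8-core machine)
    export SPARK_WORKER_INSTANCES=2   # launch two workers on this host
    export SPARK_WORKER_CORES=4       # cap each worker's cores so both can actually start

and then the executors are requested at submit time:

    # hypothetical submit command; the class name and jar are placeholders
    spark-submit \
      --class com.example.MyApp \
      --num-executors 4 \
      my-app.jar

The point of SPARK_WORKER_CORES is that it keeps the first worker from grabbing every core on the machine, so the remaining workers still have cores left to launch with.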
Upvotes: 0