Reputation: 58
I'm using Spark 1.5.2 in standalone deployment mode and launching with the launch scripts. The executor memory is set via 'spark.executor.memory' in conf/spark-defaults.conf, which applies the same memory limit to all the worker nodes. I would like to be able to set a different limit for different nodes. How can I do that?
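For reference, this is all I have today (the 4g value is just an example):

```
# conf/spark-defaults.conf -- one value, applied to every executor on every node
spark.executor.memory   4g
```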
(Spark 1.5.2, Ubuntu 14.04)
Thanks,
Upvotes: 3
Views: 655
Reputation: 737
I don't believe there is any way to have heterogeneously sized executors. You can cap the memory of each worker node, but that only limits the total amount of memory the worker can allocate to executors on that box.
When you run your application, you can only specify spark.executor.memory at the application level, and that single value is requested for every executor on every worker.
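For example, both of these request the same amount for every executor (the 2g value and the jar name are just placeholders):

```
# Per-application setting; there is no per-node variant of this flag
spark-submit --executor-memory 2g your-app.jar
# equivalent:
spark-submit --conf spark.executor.memory=2g your-app.jar
```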
If you're in a situation where you have heterogeneously sized boxes, what you can do is set SPARK_WORKER_MEMORY to the smaller amount, then set SPARK_WORKER_INSTANCES=2 on the larger boxes and SPARK_WORKER_INSTANCES=1 on the smaller boxes. Assuming the larger boxes have twice the memory of the smaller ones, you end up using all of the memory on both kinds of box, because the larger boxes simply run twice as many executors. A sketch of the spark-env.sh settings is below.
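As a sketch, assuming 8g on the smaller boxes and 16g on the larger ones (the sizes are purely illustrative), conf/spark-env.sh would look like this on each class of machine:

```
# conf/spark-env.sh on a smaller (8g) box: one worker with all of its memory
export SPARK_WORKER_MEMORY=8g
export SPARK_WORKER_INSTANCES=1

# conf/spark-env.sh on a larger (16g) box: two workers, each sized like the
# smaller boxes, so the full 16g is still usable. When running multiple
# workers per node you may also want to set SPARK_WORKER_CORES to split
# the cores between them.
export SPARK_WORKER_MEMORY=8g
export SPARK_WORKER_INSTANCES=2
```

Then an application submitted with spark.executor.memory=8g gets one executor per worker instance: one executor on each smaller box and two on each larger box.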
Upvotes: 1