J.Done

Reputation: 3033

executor memory setting in spark

I made a standalone cluster and wanted to find the fastest way to process my app. My machine has 12 GB of RAM. Here are some results from what I tried.

Test A (took 15 mins)
1 worker node
spark.executor.memory = 8g
spark.driver.memory = 6g

Test B (took 8 mins)
2 worker nodes
spark.executor.memory = 4g
spark.driver.memory = 6g

Test C (took 6 mins)
2 worker nodes
spark.executor.memory = 6g
spark.driver.memory = 6g

Test D (took 6 mins)
3 worker nodes
spark.executor.memory = 4g
spark.driver.memory = 6g

Test E (took 6 mins)
3 worker nodes
spark.executor.memory = 6g
spark.driver.memory = 6g
  1. Compared with Test A, Test B just added one more worker (with the same total memory: 4 GB × 2 = 8 GB), but it made the app faster. Why did that happen?
  2. Tests C, D, and E tried to use more memory than the machine has, but they still worked, and were even faster. Is the configured memory size just an upper limit?
  3. The app does not get faster simply by adding worker nodes. How should I find the best number of workers and executor memory size?
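For context, these settings can be supplied through the cluster configuration; a spark-defaults.conf fragment like the one below would reproduce Test C (the master URL here is an example, not from the question):

```
# spark-defaults.conf fragment matching Test C (master-host is a placeholder)
spark.master            spark://master-host:7077
spark.executor.memory   6g
spark.driver.memory     6g
```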

Upvotes: 1

Views: 1046

Answers (1)

imriqwe
imriqwe

Reputation: 1455

On Test B, your application ran in parallel on 2 CPUs, so the total time was almost halved.
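The observed numbers are close to ideal parallel scaling, which a quick back-of-the-envelope check shows (the 15-minute baseline is Test A from the question):

```python
# Ideal parallel scaling: the same work split across 2 CPUs should take
# about half the serial time, plus some coordination overhead.
serial_minutes = 15          # Test A: 1 worker node
workers = 2                  # Test B: 2 worker nodes
ideal_minutes = serial_minutes / workers
print(ideal_minutes)         # 7.5 -- close to the observed 8 minutes
```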

Regarding memory - the memory setting defines an upper limit. Setting a small amount will make your app perform more GC, and if your heap eventually fills up, you'll get an OutOfMemoryError.

Regarding the most suitable configuration - well, it depends. If your task does not consume much RAM, configure Spark to have as many executors as you have CPUs. Otherwise, configure your executors to match the amount of RAM they actually need. Keep in mind that these limits are not fixed, and may change with your application's requirements.
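That rule of thumb can be sketched as a small helper (this is an illustration, not a Spark API; the 2-core count for the asker's machine is an assumption):

```python
# Rough sizing heuristic for a single-machine standalone cluster:
# pick an executor count and per-executor memory (whole GB).
def size_executors(total_ram_gb, driver_gb, cores, per_executor_need_gb=None):
    """Return (num_executors, memory_per_executor_gb).

    If the job is CPU-bound (per_executor_need_gb is None), use one
    executor per core; otherwise cap the executor count so each one
    gets the RAM it needs, after setting driver memory aside.
    """
    usable = total_ram_gb - driver_gb
    if per_executor_need_gb is None:
        n = cores                                   # CPU-bound: max parallelism
    else:
        n = max(1, min(cores, usable // per_executor_need_gb))
    return n, usable // n

# The asker's machine: 12 GB total, 6 GB driver, 2 cores (assumed).
print(size_executors(12, 6, 2))      # CPU-bound: (2, 3) -- 2 executors, 3 GB each
print(size_executors(12, 6, 2, 6))   # RAM-hungry: (1, 6) -- fewer, bigger executors
```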

Upvotes: 1
