Reputation: 427
In order to parallelize spark jobs, both the number of cores and the number of executor instances can be increased. What is the trade-off here and how should one pick the actual values of both the configs?
Upvotes: 2
Views: 2880
Reputation: 4045
I recommend that you read this blog post from Cloudera.
Benchmarking your PySpark job by varying the number of executors against the number of cores per executor is the best way to arrive at the right configuration for your application.
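As a sketch, such a benchmark can be driven through `spark-submit` by sweeping the two settings in question (the specific values below are illustrative, not recommendations):

```shell
# Sweep executor count against cores per executor while holding
# total cores roughly constant (here ~20), then compare job runtimes.
# Assumes a YARN cluster and a job script named my_job.py (hypothetical).
for conf in "20 1" "10 2" "5 4" "4 5"; do
  set -- $conf
  num_executors=$1
  executor_cores=$2
  echo "Running with --num-executors=$num_executors --executor-cores=$executor_cores"
  spark-submit \
    --master yarn \
    --num-executors "$num_executors" \
    --executor-cores "$executor_cores" \
    --executor-memory 4g \
    my_job.py
done
```

Fewer, larger executors reduce per-executor overhead and can share broadcast variables more efficiently, while many small executors improve isolation and GC behavior; measuring actual runtimes across such a sweep reveals where the trade-off lands for a given workload.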
Upvotes: 3