qingpan

Reputation: 426

Are more partitions better in Spark?

I am new to Spark and have a question.

Are more partitions better in Spark? If I have an OOM issue, do more partitions help?

Upvotes: 1

Views: 2392

Answers (1)

Dazzler

Reputation: 847

Partitions determine the degree of parallelism. The Apache Spark documentation says that the number of partitions should be at least equal to the number of cores in the cluster.

With too few partitions, not all the cores in the cluster are utilized. With too many partitions over a small dataset, too many small tasks get scheduled, and the scheduling overhead outweighs the actual work.
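As a concrete illustration of sizing partitions to cores, here is a minimal sketch of the relevant settings at submit time. The cluster size, job name, and the factor of 2 tasks per core (a common rule of thumb from the Spark tuning guide) are assumptions, not taken from the answer above.

```shell
# Illustrative only: assume a cluster with 3 executors x 4 cores = 12 cores.
# Setting the default parallelism to ~2x the core count keeps every core busy
# without creating a flood of tiny tasks. my_job.py is a hypothetical job.
spark-submit \
  --conf spark.default.parallelism=24 \
  --conf spark.sql.shuffle.partitions=24 \
  my_job.py
```

You can also call `repartition(n)` on a DataFrame or RDD inside the job to change the partition count for a specific stage.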

If you're getting an out-of-memory issue, you would have to increase the executor memory. It should be a minimum of 8 GB.
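For reference, executor memory is set when the job is submitted. This is a minimal sketch assuming spark-submit; the job name and executor count are illustrative.

```shell
# Illustrative only: give each executor 8 GB of heap, per the suggestion above.
# Note that OOM is usually addressed by raising memory or repartitioning data,
# not by simply adding partitions.
spark-submit \
  --executor-memory 8g \
  --num-executors 3 \
  my_job.py
```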

Upvotes: 1
