user2628641

Reputation: 2154

Why can a Spark executor's shuffle read exceed its memory allocation?

I have set up the parameters for a Spark application as follows:

--conf spark.executor.instances=20 
--conf spark.driver.memory=4G 
--conf spark.executor.memory=14G 
--conf spark.executor.cores=8
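
For reference, the same settings could also be expressed programmatically through SparkConf; the sketch below is just the equivalent form, not my actual submit script:

// Equivalent programmatic form of the flags above (a sketch only).
// Note: spark.driver.memory normally has to be set before the driver JVM
// starts, so in practice it is passed on the command line or in spark-defaults.
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .set("spark.executor.instances", "20")
  .set("spark.driver.memory", "4G")
  .set("spark.executor.memory", "14G")
  .set("spark.executor.cores", "8")

val sc = new SparkContext(conf)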

There is a shuffle stage in my job, and when I check the Spark UI I see that several executors read more than 20G of shuffle data (shuffle read size), yet there is no out-of-memory exception.

Can someone explain why the executor can read more data than the amount of memory allocated (14G)?

Upvotes: 1

Views: 1796

Answers (1)

Tim

Reputation: 3725

TL;DR: Shuffle stores intermediate results on disk.

Most Spark operations work in a streaming fashion. Let's write a simple program that reads a 1 TB text file and counts its lines:

sc.textFile("BIG_FILE.txt").count()

I could run this program on one small computer if I needed to (though it would take a long time), because only one line needs to be in memory at a time. Shuffle works in much the same way: the incoming data from other nodes is written to disk and streamed through later.
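
To make that concrete, here is a sketch of a job with a shuffle stage (the file path and tokenization are placeholders):

// reduceByKey triggers a shuffle: each map task writes its output to local disk,
// and the reduce side fetches those blocks from the other nodes, streaming through
// them and spilling to disk when they don't fit in memory.
val counts = sc.textFile("BIG_FILE.txt")
  .flatMap(line => line.split("\\s+"))
  .map(word => (word, 1L))
  .reduceByKey(_ + _)

counts.count()

That is why the shuffle read size reported in the UI can be much larger than spark.executor.memory: it counts bytes fetched, not bytes held in memory at once.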

Upvotes: 2
