CubemonkeyNYC

Reputation: 283

Spark 2.2 fails with more memory or workers, succeeds with very little memory and few workers

We have a Spark 2.2 job written in Scala running on a YARN cluster that does the following (a rough sketch appears after the list):

  1. Read several thousand small compressed parquet files (~15kb each) into two dataframes
  2. Join the dataframes on one column
  3. foldLeft over all columns to clean some data
  4. Drop duplicates
  5. Write result dataframe to parquet
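
For context, here is a minimal sketch of what that pipeline might look like; the paths, the join key (`id`), and the trim-based cleaning step are placeholders, not our actual code:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, trim}
import org.apache.spark.sql.types.StringType

// Minimal sketch of the pipeline described above; paths, the join key ("id")
// and the cleaning logic are illustrative placeholders.
val spark = SparkSession.builder().appName("join-clean-dedup").getOrCreate()

val left  = spark.read.parquet("hdfs:///data/left")   // thousands of small (~15 KB) parquet files
val right = spark.read.parquet("hdfs:///data/right")

val joined = left.join(right, Seq("id"))              // 2. join on one column

// 3. foldLeft over all columns; here the "cleaning" just trims string columns
val stringCols = joined.schema.fields
  .collect { case f if f.dataType == StringType => f.name }
  .toSet
val cleaned = joined.columns.foldLeft(joined) { (df, c) =>
  if (stringCols(c)) df.withColumn(c, trim(col(c))) else df
}

cleaned
  .dropDuplicates()                                    // 4. drop duplicates
  .write.mode("overwrite").parquet("hdfs:///data/out") // 5. write result
```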

The following configuration fails with java.lang.OutOfMemoryError: Java heap space:

However, this job runs reliably if we remove spark.executor.memory entirely, which leaves each executor with the default 1g of RAM.

This job also fails if we do any of the following:

Can anyone help me understand why more memory and more executors lead to jobs failing with OutOfMemoryError?

Upvotes: 0

Views: 275

Answers (1)

maverik

Reputation: 786

Manually setting these parameters disables dynamic allocation. Try leaving dynamic allocation enabled: it is a sensible default while you are getting started, and it is also useful for experimentation before you fine-tune cluster sizing for a production setting.
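
As a rough sketch (assuming a YARN cluster with the external shuffle service running on the NodeManagers; the min/max values are illustrative, not recommendations for your job), leaving dynamic allocation on looks like this:

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: assumes the YARN external shuffle service is available.
val spark = SparkSession.builder()
  .appName("parquet-join-dedup")
  .config("spark.dynamicAllocation.enabled", "true")
  .config("spark.shuffle.service.enabled", "true")     // required for dynamic allocation on YARN
  .config("spark.dynamicAllocation.minExecutors", "2")
  .config("spark.dynamicAllocation.maxExecutors", "20")
  // note: do not also set spark.executor.instances -- that switches dynamic allocation off
  .getOrCreate()
```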

Throwing more memory/executors at Spark seems like a good idea, but in your case it probably caused extra shuffles and/or decreased HDFS I/O throughput. This article, while slightly dated and geared towards Cloudera users, explains how to tune parallelism by right-sizing executors.
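
For reference, the kind of back-of-the-envelope sizing such guides suggest looks roughly like this (the node specs below are assumptions for illustration, not values from your cluster):

```scala
// Back-of-the-envelope executor sizing; node specs are assumed for illustration.
val coresPerNode = 16
val memPerNodeGb = 64

// Leave ~1 core and ~1 GB for the OS and NodeManager daemons.
val usableCores = coresPerNode - 1   // 15
val usableMemGb = memPerNodeGb - 1   // 63

// ~5 cores per executor is a common starting point for good HDFS throughput.
val coresPerExec = 5
val execsPerNode = usableCores / coresPerExec                 // 3
// Reserve roughly 10% of each executor's share for YARN memory overhead.
val memPerExecGb = ((usableMemGb / execsPerNode) * 0.9).toInt // ~18

println(s"executors/node: $execsPerNode, --executor-cores $coresPerExec, --executor-memory ${memPerExecGb}g")
```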

Upvotes: 0
