oleber

Reputation: 1079

Why does Spark application in Docker container fail with OutOfMemoryError: Java heap space?

I'm using an r4.8xlarge on the AWS Batch service to run Spark. This is already a big machine: 32 vCPUs and 244 GB of memory. On AWS Batch the process runs inside a Docker container. From multiple sources, I read that we should run java with the parameters:

-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -XX:MaxRAMFraction=1

Even with these parameters, the process never went over 31 GB of resident memory and 45 GB of virtual memory.
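For reference, the effect of these flags can be checked against an explicit container memory limit; a minimal sketch (the openjdk:8 image and the 8 GB limit are only illustrative, not my actual setup):

$ docker run --rm -m 8g openjdk:8 java -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -XX:MaxRAMFraction=1 -XshowSettings:vm -version

With -m 8g and MaxRAMFraction=1, the estimated max heap should come out close to the 8 GB container limit.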

Here are the analyses I did.

first test

java -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -XX:MaxRAMFraction=1 -XshowSettings:vm -version
VM settings:
    Max. Heap Size (Estimated): 26.67G
    Ergonomics Machine Class: server
    Using VM: OpenJDK 64-Bit Server VM

openjdk version "1.8.0_151"
OpenJDK Runtime Environment (build 1.8.0_151-8u151-b12-1~deb9u1-b12)
OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)

second test

docker run -it --rm 650967531325.dkr.ecr.eu-west-1.amazonaws.com/java8_aws java -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -XX:MaxRAMFraction=2 -XshowSettings:vm -version
VM settings:
    Max. Heap Size (Estimated): 26.67G
    Ergonomics Machine Class: server
    Using VM: OpenJDK 64-Bit Server VM

openjdk version "1.8.0_151"
OpenJDK Runtime Environment (build 1.8.0_151-8u151-b12-1~deb9u1-b12)
OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)

third test

java -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -XX:MaxRAMFraction=10 -XshowSettings:vm -version
VM settings:
    Max. Heap Size (Estimated): 11.38G
    Ergonomics Machine Class: server
    Using VM: OpenJDK 64-Bit Server VM

openjdk version "1.8.0_151"
OpenJDK Runtime Environment (build 1.8.0_151-8u151-b12-1~deb9u1-b12)
OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)

The system is built with Native Packager as a standalone application. The SparkSession is built as follows, with Cores equal to 31 (32 - 1):

SparkSession
  .builder()
  .appName(applicationName)
  .master(s"local[$Cores]")
  .config("spark.executor.memory", "3g")
  .getOrCreate()

In reply to egorlitvinenko:

$ docker stats
CONTAINER ID        NAME                                                                    CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
0c971993f830        ecs-marcos-BatchIntegration-DedupOrder-3-default-aab7fa93f0a6f1c86800   1946.34%            27.72GiB / 234.4GiB   11.83%              0B / 0B             72.9MB / 160kB      0
a5d6bf5522f6        ecs-agent                                                               0.19%               19.56MiB / 240.1GiB   0.01%               0B / 0B             25.7MB / 930kB      0

In more tests, now with the Oracle JDK, the memory never went over 4 GB:

$ 'spark-submit' '--class' 'integration.deduplication.DeduplicationApp' '--master' 'local[31]' '--executor-memory' '3G' '--driver-memory' '3G' '--conf' '-Xmx=150g' '/localName.jar' '--inPath' 's3a://dp-import-marcos-refined/platform-services/order/merged/*/*/*/*' '--outPath' 's3a://dp-import-marcos-refined/platform-services/order/deduplicated' '--jobName' 'DedupOrder' '--skuMappingPath' 's3a://dp-marcos-dwh/redshift/item_code_mapping'

I used the parameters -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -XX:MaxRAMFraction=2 on my Spark run, and it is clearly not using all the memory. How can I fix this issue?

Upvotes: 1

Views: 2253

Answers (1)

Jacek Laskowski

Reputation: 74779

tl;dr Use --driver-memory and --executor-memory when you spark-submit your Spark application, or set the proper memory settings of the JVM that hosts it.


The memory for the driver is 1024M by default, which you can check with spark-submit:

--driver-memory MEM Memory for driver (e.g. 1000M, 2G) (Default: 1024M).

The memory for an executor is 1G by default, which you can again check with spark-submit:

--executor-memory MEM Memory per executor (e.g. 1000M, 2G) (Default: 1G).
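Both defaults are printed by the launcher itself; a quick way to see them (a sketch) is:

$ spark-submit --help 2>&1 | grep -i memory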

With that said, it does not really matter how much memory your execution environment has in total, as a Spark application won't use more than the default 1G for the driver and executors.
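You can verify this from inside the running application; a minimal sketch (assuming spark is the SparkSession, as in spark-shell):

// maximum heap the driver JVM was actually started with
val maxHeapGb = Runtime.getRuntime.maxMemory.toDouble / (1024 * 1024 * 1024)
println(f"Driver JVM max heap: $maxHeapGb%.2f GB")
// spark.driver.memory as the application sees it (unset unless passed at launch)
println(spark.sparkContext.getConf.getOption("spark.driver.memory").getOrElse("spark.driver.memory not set"))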

Since you use the local master URL, the memory settings of the driver's JVM are already fixed when you execute your Spark application. It is simply too late to set them while creating a SparkSession. The single JVM of the Spark application (with the driver and a single executor all running in the same JVM) is already up, so no config can change it.
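Because the application is packaged with Native Packager, the heap has to be set on the launching java process itself. A minimal sketch (the script name, the -J passthrough of the generated start script, and the 200g size are assumptions, not taken from your setup):

./bin/deduplication-app -J-Xmx200g --inPath ... --outPath ...

or, when launching with plain java:

java -Xmx200g -cp localName.jar integration.deduplication.DeduplicationApp --inPath ... --outPath ...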

In other words, how much memory a Docker container has has no impact on how much memory the Spark application uses. They are environments configured independently. Of course, the more memory a Docker container has, the more a process inside it could ever get (so they are indeed interconnected).

Use --driver-memory and --executor-memory when you spark-submit your Spark application, or set the proper memory settings of the JVM that hosts it.
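For example, something like the following should let the single local-mode JVM use most of the machine; a sketch (the 200g figure is only illustrative; with local[31] the --driver-memory setting governs the one JVM that runs both the driver and the executor):

$ spark-submit \
    --class integration.deduplication.DeduplicationApp \
    --master 'local[31]' \
    --driver-memory 200g \
    /localName.jar --inPath ... --outPath ...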

Upvotes: 1
