york

Reputation: 31

Yarn container out of memory when using large memory mapped file

I am using Hadoop 2.4. The reducer uses several large memory-mapped files (about 8G total). The reducer itself uses very little memory. To my knowledge, a memory-mapped file (FileChannel.map in read-only mode) also uses little JVM memory, since the mapping is managed by the OS rather than the JVM heap.
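Roughly, the mapping I mean looks like this (a minimal sketch with a hypothetical file path, not the actual reducer code):

import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class MmapSketch {
    public static void main(String[] args) throws IOException {
        // Hypothetical path standing in for one of the ~1.5G files used by the reducer.
        Path path = Paths.get("/data/lookup-part-0.bin");
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            // Read-only mapping: the pages live in the OS page cache, outside the JVM heap,
            // so they are not covered by -Xmx.
            MappedByteBuffer buffer =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            byte first = buffer.get(0); // touching a page faults it in and adds to the process RSS
            System.out.println(first);
        }
    }
}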

I got this error:

Container [pid=26783,containerID=container_1389136889967_0009_01_000002] 
is running beyond physical memory limits. 
Current usage: 4.2 GB of 4 GB physical memory used;
5.2 GB of 8.4 GB virtual memory used. Killing container

Here were my settings:

mapreduce.reduce.java.opts=-Xmx2048m

mapreduce.reduce.memory.mb=4096

So I adjusted the parameters to this and it worked:

mapreduce.reduce.java.opts=-Xmx10240m

mapreduce.reduce.memory.mb=12288

I further tuned the parameters and got it to work like this:

mapreduce.reduce.java.opts=-Xmx2048m

mapreduce.reduce.memory.mb=10240
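For completeness, the same split can also be set per job from the driver; a minimal sketch, assuming the properties are applied programmatically rather than in mapred-site.xml:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class JobMemorySettings {
    public static Job configure() throws Exception {
        Configuration conf = new Configuration();
        // Container size requested from YARN for each reducer (in MB).
        conf.set("mapreduce.reduce.memory.mb", "10240");
        // Keep the heap small; the rest of the container is left for the memory-mapped files.
        conf.set("mapreduce.reduce.java.opts", "-Xmx2048m");
        return Job.getInstance(conf, "reducer-with-mapped-files");
    }
}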

My question is: why do I need the YARN container to have about 8G more memory than the JVM heap size? The culprit seems to be the large Java memory-mapped files I use (each about 1.5G, summing to about 8G). Aren't memory-mapped files managed by the OS, and aren't they supposed to be sharable by multiple processes (e.g. reducers)?

I use an AWS m2.4xlarge instance (67G memory); it has about 8G unused, so the OS should have sufficient memory. With the current settings, only about 5 reducers fit on each instance, and each reducer carries an extra 8G of memory. This just seems very wasteful.

Upvotes: 3

Views: 5831

Answers (2)

Vijayanand

Reputation: 500

Please check the link below; you may need to tune the property mapreduce.reduce.shuffle.input.buffer.percent:

Out of memory error in Mapreduce shuffle phase

Upvotes: 0

Vasu

Reputation: 4982

From the logs, it seems that you have the yarn.nodemanager.pmem-check-enabled and yarn.nodemanager.vmem-check-enabled properties enabled in yarn-site.xml. When these checks are enabled, the NodeManager may kill a container if it detects that the container has exceeded its resource limits. In your case, physical memory exceeded the configured value (4G), so the NodeManager killed the task running within the container.
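For reference, these checks are controlled by the following NodeManager properties (shown with what I believe are the defaults; the 2.1 virtual-to-physical ratio is also where the 8.4 GB virtual limit in your log comes from, i.e. 4 GB x 2.1):

yarn.nodemanager.pmem-check-enabled=true
yarn.nodemanager.vmem-check-enabled=true
yarn.nodemanager.vmem-pmem-ratio=2.1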

In normal cases, heap memory (defined using the -Xmx property in the mapreduce.reduce.java.opts and mapreduce.map.java.opts configurations) is set to 75-80% of the total container memory (defined using the mapreduce.reduce.memory.mb and mapreduce.map.memory.mb configurations). However, in your case, because of the Java memory-mapped files, the non-heap memory requirement is higher than the heap requirement, and that's why you had to keep quite a large gap between the total and heap memory.
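As a rough illustration of that gap (assuming the common ~80% heuristic for the conventional case):

mapreduce.reduce.memory.mb=4096
mapreduce.reduce.java.opts=-Xmx3276m    (conventional split: heap is ~80% of the container)

mapreduce.reduce.memory.mb=10240
mapreduce.reduce.java.opts=-Xmx2048m    (your working split: heap ~20%, leaving ~8G of the container for the mapped files)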

Upvotes: 2
