Reputation: 31
I am using Hadoop 2.4. The reducer uses several large memory-mapped files (about 8G total). The reducer itself uses very little heap memory. To my knowledge, a memory-mapped file (FileChannel.map(readonly)) also uses little JVM memory, since it is managed by the OS instead of the JVM.
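For reference, this is roughly how each file is mapped; the path and class name below are just placeholders, not my actual code:

import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class ReadOnlyMapping {
    public static void main(String[] args) throws IOException {
        // Placeholder path; each real file is about 1.5G, well under the
        // Integer.MAX_VALUE limit of a single FileChannel.map() region.
        Path path = Paths.get("/data/lookup-part-0.bin");
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            MappedByteBuffer buffer =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            // Pages are faulted in lazily by the OS page cache, not allocated
            // on the JVM heap.
            byte first = buffer.get(0);
            System.out.println("first byte: " + first);
        }
    }
}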
I got this error:
Container [pid=26783,containerID=container_1389136889967_0009_01_000002]
is running beyond physical memory limits.
Current usage: 4.2 GB of 4 GB physical memory used;
5.2 GB of 8.4 GB virtual memory used. Killing container
Here were my settings:
mapreduce.reduce.java.opts=-Xmx2048m
mapreduce.reduce.memory.mb=4096
So I adjusted the parameters to the following, and it worked:
mapreduce.reduce.java.opts=-Xmx10240m
mapreduce.reduce.memory.mb=12288
I further adjusted the parameters and got it to work like this:
mapreduce.reduce.java.opts=-Xmx2048m
mapreduce.reduce.memory.mb=10240
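The same settings can also be applied per job through the Configuration API; a minimal sketch (the class name, job name, and the rest of the driver are placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ReducerMemoryDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Keep the heap small; the extra container headroom is for the
        // memory-mapped files.
        conf.set("mapreduce.reduce.java.opts", "-Xmx2048m");
        conf.setInt("mapreduce.reduce.memory.mb", 10240);
        Job job = Job.getInstance(conf, "reducer-with-mapped-files");
        // ... set mapper/reducer classes, input/output paths, etc.
    }
}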
My question is: why does the YARN container need about 8G more memory than the JVM heap size? The culprit seems to be the large Java memory-mapped files I use (each about 1.5G, summing to about 8G). Aren't memory-mapped files managed by the OS, and aren't they supposed to be shareable by multiple processes (e.g. reducers)?
I use AWS m2.4xlarge instances (67G memory); each has about 8G unused, so the OS should have sufficient memory. With the current settings, only about 5 reducers fit on each instance, and each reducer carries an extra 8G of memory. This just looks very wasteful.
Upvotes: 3
Views: 5831
Reputation: 500
Please check the link below; you may need to tune the property mapreduce.reduce.shuffle.input.buffer.percent:
Out of memory error in Mapreduce shuffle phase
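For example, it can be lowered per job; a minimal sketch (the default is 0.70, and the 0.2 below is only an illustrative value, not a recommendation):

import org.apache.hadoop.conf.Configuration;

public class ShuffleBufferTuning {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Fraction of the reducer heap used to buffer map outputs during the
        // shuffle phase.
        conf.setFloat("mapreduce.reduce.shuffle.input.buffer.percent", 0.2f);
        System.out.println(conf.get("mapreduce.reduce.shuffle.input.buffer.percent"));
    }
}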
Upvotes: 0
Reputation: 4982
From the logs, it seems that you have enabled the yarn.nodemanager.pmem-check-enabled
and yarn.nodemanager.vmem-check-enabled
properties in yarn-site.xml
. If these checks are enabled, the NodeManager
may kill a container if it detects that the container has exceeded its resource limits. In your case, physical memory usage exceeded the configured value (4G), so the NodeManager
killed the task (running within the container).
In normal cases, the heap size (defined using the -Xmx
property in the mapreduce.reduce.java.opts
and mapreduce.map.java.opts
configurations) is set to 75-80% of the total container memory (defined using the mapreduce.reduce.memory.mb
and mapreduce.map.memory.mb
configurations). However, in your case, because of the Java memory-mapped files, the non-heap memory requirements are higher than the heap requirements, and that is why you had to keep such a large gap between the total and heap memory.
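To make the sizing concrete, a rough back-of-the-envelope sketch using the numbers from the question (the 75-80% figure is only a rule of thumb, not a fixed rule):

public class ContainerSizing {
    public static void main(String[] args) {
        // Normal case: heap at roughly 75-80% of the container size.
        int normalContainerMb = 4096;
        int normalHeapMb = (int) (normalContainerMb * 0.8);    // ~3276 MB

        // This question's case: small heap, large non-heap headroom for the
        // roughly 8 GB of memory-mapped files counted against the container.
        int containerMb = 10240;   // mapreduce.reduce.memory.mb
        int heapMb = 2048;         // -Xmx in mapreduce.reduce.java.opts
        int nonHeapHeadroomMb = containerMb - heapMb;          // 8192 MB

        System.out.println("normal heap: " + normalHeapMb + " MB");
        System.out.println("non-heap headroom: " + nonHeapHeadroomMb + " MB");
    }
}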
Upvotes: 2