Reputation: 3836
I have a standalone Java application which has:
-Xmx1024m -Xms1024m -XX:MaxPermSize=256m -XX:PermSize=256m
Over the course of time it hogs more and more memory, starts to swap (and slow down), and has eventually died a number of times (no OOM error or heap dump; it just died, with nothing in /var/log/messages).
What I've tried so far:
Checking vmstat (obviously we start to swap):
--------------------------procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
Sun Jul 22 16:10:26 2012: r b swpd free buff cache si so bi bo in cs us sy id wa st
Sun Jul 22 16:48:41 2012: 0 0 138652 2502504 40360 706592 1 0 169 21 1047 206 20 1 74 4 0
. . .
Sun Jul 22 18:10:59 2012: 0 0 138648 24816 58600 1609212 0 0 124 669 913 24436 43 22 34 2 0
Sun Jul 22 19:10:22 2012: 33 1 138644 33304 4960 1107480 0 0 100 536 810 19536 44 22 23 10 0
Sun Jul 22 20:10:28 2012: 54 1 213916 26928 2864 578832 3 360 100 710 639 12702 43 16 30 11 0
Sun Jul 22 21:10:43 2012: 0 0 629256 26116 2992 467808 84 176 278 1320 1293 24243 50 19 29 3 0
Sun Jul 22 22:10:55 2012: 4 0 772168 29136 1240 165900 203 94 435 1188 1278 21851 48 16 33 2 0
Sun Jul 22 23:10:57 2012: 0 1 2429536 26280 1880 169816 6875 6471 7081 6878 2146 8447 18 37 1 45 0
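(For reference: timestamped samples like the above can be produced by prefixing each vmstat line with the current date. This is only a sketch of one way to do it; the sampling interval is an assumption based on the roughly hourly rows.)
# prefix each vmstat sample with a timestamp (interval assumed)
vmstat 3600 | while read -r line; do
    echo "$(date '+%a %b %d %T %Y'): $line"
done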
sar also shows steady %system growth, which is consistent with swapping:
15:40:02 CPU %user %nice %system %iowait %steal %idle
17:40:01 all 51.00 0.00 7.81 3.04 0.00 38.15
19:40:01 all 48.43 0.00 18.89 2.07 0.00 30.60
20:40:01 all 43.93 0.00 15.84 5.54 0.00 34.70
21:40:01 all 46.14 0.00 15.44 6.57 0.00 31.85
22:40:01 all 44.25 0.00 20.94 5.43 0.00 29.39
23:40:01 all 18.24 0.00 52.13 21.17 0.00 8.46
12:40:02 all 22.03 0.00 41.70 15.46 0.00 20.81
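(Historical figures like these come from sysstat's daily data files. Assuming a standard sysstat setup, something like this reads them back; the file name suffix is the day of the month:)
# read CPU utilization history from the sysstat daily file for the 22nd
sar -u -f /var/log/sa/sa22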
Checking pmap gave the following largest contributors:
000000005416c000 1505760K rwx-- [ anon ]
00000000b0000000 1310720K rwx-- [ anon ]
00002aaab9001000 2079748K rwx-- [ anon ]
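(For reference, a sketch of how such a list can be pulled out of pmap; <pid> is a placeholder, and this assumes the size is the second column, as in the default pmap output above:)
# show the largest mappings last; sizes like "1505760K" sort numerically
pmap <pid> | sort -k2 -n | tail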
Trying to correlate the addresses I got from pmap with the allocations dumped by strace gave me no matches.
Adding more memory is not practical (it would just make the problem appear later).
And the question is: What else can I try to track down the problem's cause or try to work around it?
Upvotes: 2
Views: 2397
Reputation: 5310
There is a known problem with Java and glibc >= 2.10 (includes Ubuntu >= 10.04, RHEL >= 6).
The cure is to set this environment variable:
export MALLOC_ARENA_MAX=4
There is an IBM article about setting MALLOC_ARENA_MAX: https://www.ibm.com/developerworks/community/blogs/kevgrig/entry/linux_glibc_2_10_rhel_6_malloc_may_show_excessive_virtual_memory_usage?lang=en
It notes that resident memory has been known to creep in a manner similar to a memory leak or memory fragmentation.
Search for MALLOC_ARENA_MAX on Google or SO for more references.
You might also want to tune other malloc options to optimize for low fragmentation of allocated memory:
# tune glibc memory allocation, optimize for low fragmentation
# limit the number of arenas
export MALLOC_ARENA_MAX=2
# disable dynamic mmap threshold, see M_MMAP_THRESHOLD in "man mallopt"
export MALLOC_MMAP_THRESHOLD_=131072
export MALLOC_TRIM_THRESHOLD_=131072
export MALLOC_TOP_PAD_=131072
export MALLOC_MMAP_MAX_=65536
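A typical way to apply these is in the script that launches the JVM, before exec'ing java. This is only a sketch: start-app.sh and com.example.Main are hypothetical placeholders, and the heap flags are just copied from the question.
#!/bin/sh
# start-app.sh: apply glibc malloc tuning, then hand the process over to the JVM
export MALLOC_ARENA_MAX=2
export MALLOC_MMAP_THRESHOLD_=131072
export MALLOC_TRIM_THRESHOLD_=131072
export MALLOC_TOP_PAD_=131072
export MALLOC_MMAP_MAX_=65536
exec java -Xmx1024m -Xms1024m -XX:MaxPermSize=256m -XX:PermSize=256m com.example.Main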
Upvotes: 1
Reputation: 3836
The problem was in an attached profiler library: it recorded CPU calls and allocation sites, and so needed native memory to store all of that.
So, human factor here :)
Upvotes: 1
Reputation: 719446
Something in your JVM is using an "unbounded" amount of non-heap memory. Some possible candidates are:
1. Lots of threads (each thread has its own native stack, allocated outside the heap).
2. Native libraries allocating native memory.
3. Memory-mapped files.
The first possibility will show up as a large (and increasing) number of threads when you take a thread stack dump. (Just check it ... OK?)
The second one you can (probably) eliminate if your application (or some 3rd-party library it uses) doesn't use any native libraries.
The third one you can eliminate if your application (or some 3rd-party library it uses) doesn't use memory-mapped files. (A couple of quick spot-checks are sketched below.)
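Assuming standard JDK and lsof tooling is available (<pid> is a placeholder):
# candidate 1: count live threads; each one carries its own native stack
jstack <pid> | grep -c '^"'
# candidate 3: list memory-mapped files (lsof marks them with FD "mem")
lsof -p <pid> | awk '$4 == "mem"'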
I would guess that the reason you are not seeing OOMEs is that your JVM is being killed by the Linux OOM killer. It is also possible that the JVM is bailing out in native code (e.g. due to a malloc failure not being handled properly), but I'd have thought that a JVM crash dump would be the more likely outcome ...
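Either way, the OOM killer normally leaves a trace in the kernel log, so it is worth double-checking with something like:
# look for evidence of the OOM killer in the kernel ring buffer / syslog
dmesg | grep -i 'killed process'
grep -i 'out of memory' /var/log/messages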
Upvotes: 1