Jens Bannmann
Jens Bannmann

Reputation: 5105

How to diagnose endless JVM GC runs?

We have a Java EE application with a heap size of several gigabytes on our production servers. Every once in a while, any one of our servers no longer reacts to any requests.

How would you start diagnosing this?

Configuration

Our JAVA_OPTS are the following: -Xms8096m -Xmx8096m -XX:MaxPermSize=512M -Dsun.rmi.dgc.client.gcInterval=1800000 -Dsun.rmi.dgc.server.gcInterval=1800000 -XX:NewSize=150M -XX:+UseParNewGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/path/to/gc.log

$ java -version
java version "1.6.0_12"
Java(TM) SE Runtime Environment (build 1.6.0_12-b04)
Java HotSpot(TM) 64-Bit Server VM (build 11.2-b01, mixed mode)

$ uname -a
Linux myhostname 2.6.18-274.3.1.el5 #1 SMP Tue Sep 6 20:13:52 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux

$ cat /proc/version
Linux version 2.6.18-274.3.1.el5 ([email protected]) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-51)) #1 SMP Tue Sep 6 20:13:52 EDT 2011

$ cat /etc/issue
CentOS release 5.7 (Final)
Kernel \r on an \m

$ cat /proc/meminfo|grep "MemTotal"
MemTotal:     16279356 kB

GC log

This is an example GC log snippet from when the problem occurred:

111036.554: [GC 111036.555: [ParNew: 2210816K->2210816K(2487104K), 0.0000090 secs]111036.555: [Tenured: 3629252K->3647971K(5526912K), 8.7565190 secs] 5840068K->3647971K(8014016K), 8.7567840 secs]
111055.691: [GC 111055.691: [ParNew: 2210816K->2210816K(2487104K), 0.0000090 secs]111055.691: [Tenured: 3647971K->3667529K(5526912K), 8.7876340 secs] 5858787K->3667529K(8014016K), 8.7878690 secs]
111071.037: [GC 111071.037: [ParNew: 2210816K->2210816K(2487104K), 0.0000090 secs]111071.037: [Tenured: 3667529K->3692057K(5526912K), 8.7581830 secs] 5878345K->3692057K(8014016K), 8.7584210 secs]
111088.407: [GC 111088.407: [ParNew: 2210816K->2210816K(2487104K), 0.0000090 secs]111088.407: [Tenured: 3692057K->3638194K(5526912K), 10.7072790 secs] 5902873K->3638194K(8014016K), 10.7074960 secs]
111110.238: [GC 111110.238: [ParNew: 2210816K->2210816K(2487104K), 0.0000090 secs]111110.238: [Tenured: 3638194K->3654614K(5526912K), 8.8021440 secs] 5849010K->3654614K(8014016K), 8.8023860 secs]
111128.115: [GC 111128.115: [ParNew: 2210816K->2210816K(2487104K), 0.0000090 secs]111128.115: [Tenured: 3654614K->3668670K(5526912K), 8.8451510 secs] 5865430K->3668670K(8014016K), 8.8453600 secs]
111161.684: [GC 111161.684: [ParNew: 2210816K->2210816K(2487104K), 0.0000090 secs]111161.684: [Tenured: 3668670K->3684080K(5526912K), 8.8156740 secs] 5879486K->3684080K(8014016K), 8.8159260 secs]
111186.669: [GC 111186.669: [ParNew: 2210816K->2210816K(2487104K), 0.0000090 secs]111186.669: [Tenured: 3684080K->3639333K(5526912K), 10.6025350 secs] 5894896K->3639333K(8014016K), 10.6030040 secs]
111208.692: [GC 111208.692: [ParNew: 2210816K->2210816K(2487104K), 0.0000090 secs]111208.692: [Tenured: 3639333K->3657993K(5526912K), 8.7967920 secs] 5850149K->3657993K(8014016K), 8.7970090 secs]
111235.486: [GC 111235.487: [ParNew: 2210816K->2210816K(2487104K), 0.0000090 secs]111235.487: [Tenured: 3657993K->3676521K(5526912K), 8.8212340 secs] 5868809K->3676521K(8014016K), 8.8214930 secs]

Upvotes: 4

Views: 543

Answers (3)

Fabian Lange
Fabian Lange

Reputation: 1856

It usually takes a bit of experimentation, but my guess is that you are seeing this problem every 30 minutes, right? When you have a full GC, like you have with those tenured lines, which does not seem to be caused by a full heap, they are very likely to be caused by System.gc(). Normally in the log it would be printed as (System), but because your VM is old, I am not sure. The GC log information changes every release. So to eliminate that, I strongly recommend to use "-XX:+DisableExplicitGC". It also ignores the call the DGC does.

The second option here might be that you have a memory leak/problem you failed to see in the dump. The lines say that you always have 2210816K in new, and old is still at 3676521K in the end. If all of the 2210816K are alive (what it looks like) then its impossible to move them to Tenured, because it will not fit (5887337K) so if it does not stop doing that you might get a GC Overhead exceeded message. But in that case you have to have those 6G live objects in heap when you took the dump

Upvotes: 0

Peter Lawrey
Peter Lawrey

Reputation: 533520

As you have a very old version of Java (almost four years old) and you appear to be getting into an error state, the first thing to try is a newer version such as Java 6 update 35. I suspect update 12 doesn't have Compressed Oops on by default which is an option which should save you some memory (and thus overhead)

Upvotes: 2

Jean-Philippe Briend
Jean-Philippe Briend

Reputation: 3515

First of all : a thread dump ! Kill -3 the process and check the log outputs.

You can confirm the GC by seeing the GC threads running.

You use 8Go for the JVM. How many gigs for the RAM ? (I hope at least 12).

Thread dump will show you the Eden, From, To, Permgen sizes. This may help to find which memory space is having a problem.

Upvotes: 1

Related Questions