Tom

Reputation: 3176

Garbage collection tuning a production application

I've been tasked with tuning a production application that consists of a Spring MVC REST interface serving large (~0 MB - 100 MB) JSON documents from a GemFire in-memory cache backend. The application runs on a CentOS server inside Tomcat 7 on JDK 1.6. We realized that the application needed to be tuned because we were seeing frequent stop-the-world old-generation garbage collections, which would eventually lead to java.lang.OutOfMemoryError: GC overhead limit exceeded errors if left unattended.

Through some trial and error and monitoring I've managed to tune the application with these parameters:

-Xms20g 
-Xmx20g 

-XX:PermSize=256m 
-XX:MaxPermSize=256m

-XX:NewSize=8g 
-XX:MaxNewSize=8g

-XX:SurvivorRatio=8
-XX:+DisableExplicitGC
-XX:+UseConcMarkSweepGC 
-XX:+UseParNewGC
-XX:CMSInitiatingOccupancyFraction=70
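
To observe the resulting behaviour, GC logging can be enabled alongside the flags above (a minimal set of standard JDK 1.6 HotSpot logging flags; the log path is just an example):

```
-verbose:gc
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintTenuringDistribution
-Xloggc:/path/to/gc.log
```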

The garbage collection behavior that I'm seeing now (48 hours under heavy test load) is that an eden-space collection happens about once every 10 seconds and lasts about 0.04 seconds. The old generation is not growing at all after 48 hours and there have been 0 collections in that space.
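
That works out to roughly 0.4% of wall-clock time spent in young-generation pauses. A trivial check of the arithmetic (numbers taken from the observations above):

```java
public class GcOverhead {
    public static void main(String[] args) {
        double pauseSeconds = 0.04;    // observed eden-collection pause
        double intervalSeconds = 10.0; // observed interval between collections
        double overheadPercent = pauseSeconds / intervalSeconds * 100.0;
        System.out.println("Young GC overhead: ~" + overheadPercent + "% of wall-clock time");
    }
}
```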

My question is should I be concerned about not having the old generation garbage collected? Overall does this look like a healthy tuning?

Edit: For anyone who cares, my GC log is available here: http://filebin.ca/2U8awo1KTS1D/udf-gc.log.0

Upvotes: 2

Views: 711

Answers (2)

Ivo

Reputation: 444

Tuning garbage collection is no different from generic performance tuning in the sense that, without requirements in place, you can (for non-trivial applications at least) effectively keep improving forever. At some point the improvements no longer matter for the practical use case. That is why you should have goals in place.

The goals regarding GC should be derived from the generic performance requirements. These in turn usually describe three dimensions:

  • Latency. Or more precisely, an acceptable latency distribution per service published by the application. For example: 99% of login() operations must complete under 500 ms and the worst case cannot exceed 2,500 ms.
  • Throughput. How many operations per time unit must be completed. This is tougher to measure for large monoliths, but if you are running microservices you can express it as, for example, "1,000 login operations processed per second".
  • Capacity. Adding more resources and scaling out will improve the situation, but in practice things such as the monthly AWS bill will set limits in this regard.
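
As an illustrative sketch of checking a latency goal like the first one above (the login() operation name, the sample values, and the nearest-rank percentile approach are all assumptions made up for the example):

```java
import java.util.Arrays;

public class LatencyGoal {
    // Nearest-rank percentile of a set of latency samples, in milliseconds.
    static long percentile(long[] samplesMs, double pct) {
        long[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        int idx = (int) Math.ceil(pct / 100.0 * sorted.length) - 1;
        return sorted[Math.max(idx, 0)];
    }

    public static void main(String[] args) {
        // Made-up latency samples for a login() operation, in milliseconds.
        long[] samples = {120, 95, 310, 480, 450, 150, 200, 90, 410, 130};
        System.out.println("p99 = " + percentile(samples, 99.0) + " ms");
        System.out.println("worst = " + percentile(samples, 100.0) + " ms");
    }
}
```

With these ten samples p99 comes out as 480 ms, inside the 500 ms goal; a real measurement would of course need far more samples for a p99 figure to mean anything.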

Having these requirements in place, you can start deriving your GC goals from them and, if necessary, optimizing further. The company I am affiliated with recently published a rather thorough handbook about GC tuning, so you can read more in the GC tuning sections of the handbook.

Upvotes: 0

the8472

Reputation: 43160

My question is should I be concerned about not having the old generation garbage collected?

The logs look fine. Given the trends, old-gen occupancy grows very slowly, so it will take several days until it becomes full enough for a concurrent marking cycle to be initiated.

Overall does this look like a healthy tuning?

It seems like you're giving it much more memory than it needs.

Old-gen occupancy is around 2G out of 12G. This means you could probably shrink it to 4G and still go many hours before a concurrent cycle gets started: with CMSInitiatingOccupancyFraction=70, a 4G old generation would not start a cycle until occupancy reached about 2.8G, well above the ~2G observed after 48 hours.

Most young objects only live to age 1 (out of 15) in the young generation. This means the young generation could also be shrunk without increasing object promotion too much.

-XX:CMSInitiatingOccupancyFraction=70

That should be combined with -XX:+UseCMSInitiatingOccupancyOnly.
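
Without that flag, CMS treats the configured fraction only as a starting hint and then lets its own heuristics decide when to begin a concurrent cycle. Combined, the relevant flags would look like this (the occupancy value is the one from the question):

```
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=70
-XX:+UseCMSInitiatingOccupancyOnly
```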

Upvotes: 3
