user2120041

Reputation: 13

Java Glassfish issue

I have a problem that is driving me crazy, and I need your help. The setup is as follows:

  • processor with 4 cores + HT (so Linux reports 8 CPUs)
  • CentOS
  • GlassFish 3 (newest version)
  • only one application is running on GlassFish; http-listener1 listens on port 8080 (admin listener is on, HTTPS is off)
  • there is a connection pool

Everything works as it should for days, and then it suddenly stops:

  • load goes to 400
  • CPU usage of the Java process goes to 800%
  • GlassFish stops serving pages, or serves them extremely slowly
  • the admin listener (port 4848) keeps running fine
  • when I attach a profiler, everything seems to be working fine, and I can't find what's wrong

I have run out of ideas about where to look and how to solve the problem. The issue seems to appear when a huge number of users visit the site, but GlassFish never recovers, even after all the users are gone.

Any ideas?

EDIT: JVM settings, pasted from a comment:

<jvm-options>-Xms10240m</jvm-options> 
<jvm-options>-Xmx10240m</jvm-options> 
<jvm-options>-XX:CMSIncrementalDutyCycle=10</jvm-options>  
<jvm-options>-XX:CMSIncrementalDutyCycleMin=10</jvm-options>  
<jvm-options>-XX:+CMSIncrementalMode</jvm-options>  
<jvm-options>-XX:+CMSIncrementalPacing</jvm-options>  
<jvm-options>-XX:+UseConcMarkSweepGC</jvm-options>  
<jvm-options>-XX:MaxPermSize=512m</jvm-options>  
<jvm-options>-XX:NewRatio=2</jvm-options>  
<jvm-options>-XX:PermSize=512m</jvm-options>

Upvotes: 0

Views: 1389

Answers (2)

Pierre Laporte

Reputation: 1215

This 800% CPU usage looks like you have an allocation failure.

When you enable CMS, the GC tries to free memory faster than your application consumes it. An allocation failure happens when it cannot keep up. In that case, the only option the JVM has is to run a full collection using ParallelGC, which means:

  • Your server is completely stopped
  • The ParallelGC tries to finish as fast as possible, using every possible CPU

Enable GC logging to check whether this allocation-failure assumption is correct (-Xloggc:gc.log -XX:+PrintGCDetails). Each "Full GC" line in the log is an allocation failure.
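To get a quick count out of the log, something like the following could work. It is only a minimal sketch: the class name `FullGcCounter` and its helper are made up for illustration, and it assumes the log was written to `gc.log` as in the flag above. It simply counts lines that contain "Full GC", plus lines that contain "concurrent mode failure", which is how HotSpot usually labels a CMS collection that could not finish in time.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class FullGcCounter {

    /** Counts the log lines that contain the given marker text. */
    public static long countLinesContaining(List<String> lines, String marker) {
        long count = 0;
        for (String line : lines) {
            if (line.contains(marker)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the path matches the -Xloggc value; pass another path as the first argument.
        String path = args.length > 0 ? args[0] : "gc.log";
        List<String> lines = Files.readAllLines(Paths.get(path), StandardCharsets.UTF_8);

        System.out.println("Full GC lines: "
                + countLinesContaining(lines, "Full GC"));
        System.out.println("concurrent mode failure lines: "
                + countLinesContaining(lines, "concurrent mode failure"));
    }
}
```

If the "Full GC" count keeps climbing while the server is stuck, that would support the allocation-failure theory. Keep in mind that explicit `System.gc()` calls also show up as "Full GC" lines, so the count is an upper bound.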

Once you have GC logs, try to use the following scripts to see if iCMS duty cycles are really around 10% CPU usage. A more detailed explanation is available here.

As @ppeterka said, you can profile your application to reduce memory consumption, but you can also give it more memory. Do not set the same value for -Xms and -Xmx, and remove the flags -XX:CMSIncrementalDutyCycle and -XX:CMSIncrementalDutyCycleMin.

Hope that helps!

Upvotes: 0

ppeterka

Reputation: 20726

A server can get stuck in almost indefinite GC thrashing. We had this situation go on for some 3.5 hours on a server, without an OutOfMemoryError ever being thrown...

We had a memory leak in the framework we used. What we did:

And then, fix the situation.

  • Also, it might happen that there is no memory leak, just that the GC settings need to be adjusted.
  • It is probably wise to turn on the GC logging to see what is going on
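Besides GC logging, a rough way to spot this kind of thrashing from inside the JVM is to sample the standard `GarbageCollectorMXBean` counters and see what share of wall-clock time goes to GC. The sketch below is an illustration only: the class name `GcShare`, its methods, and the 5-second interval are made up, and it measures the JVM it runs in, so for GlassFish it would have to run inside the server (or the same beans would have to be read over JMX). The reported time is approximate and collector-dependent.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcShare {

    /** Sums the accumulated collection time (ms) reported by all collectors of this JVM. */
    public static long totalGcTimeMs() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long timeMs = gc.getCollectionTime(); // -1 when the collector does not report it
            if (timeMs > 0) {
                total += timeMs;
            }
        }
        return total;
    }

    /** Fraction of the wall-clock interval spent in GC between two samples. */
    public static double gcTimeFraction(long gcMsBefore, long gcMsAfter, long wallMs) {
        if (wallMs <= 0) {
            return 0.0;
        }
        return (gcMsAfter - gcMsBefore) / (double) wallMs;
    }

    public static void main(String[] args) throws InterruptedException {
        long intervalMs = 5000; // arbitrary sampling interval for this sketch
        long before = totalGcTimeMs();
        Thread.sleep(intervalMs);
        long after = totalGcTimeMs();
        System.out.printf("GC share over the last %d ms: %.1f%%%n",
                intervalMs, 100 * gcTimeFraction(before, after, intervalMs));
    }
}
```

A share that stays high for minutes at a time, with little memory actually being freed, would point to thrashing rather than a short spike.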

Upvotes: 3
