fruitJuice

Reputation: 1241

Switching to "-XX:+UseParNewGC -XX:+UseConcMarkSweepGC" causes more full GCs

We have an application whose latency we are trying to optimize; we would like to decrease the average time spent in full GC. With "-XX:+UseParallelGC", this is what we see:

[myhost ~]$ /usr/local/jdk7/bin/jstat -gcutil 9793 1000
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
  0.00  49.57  95.93  93.32  99.48  10086  390.628   387 1005.334 1395.962
 56.99   0.00   7.42  93.32  99.48  10087  390.691   387 1005.334 1396.025
 56.99   0.00  22.19  93.32  99.49  10087  390.691   387 1005.334 1396.025
 56.99   0.00  36.28  93.32  99.49  10087  390.691   387 1005.334 1396.025
[myhost ~]$ ps -p 9793 -o etime=
 4-12:40:52

When we switch to "-XX:+UseParNewGC -XX:+UseConcMarkSweepGC", we see a lot more full GCs:

[myhost]$ /usr/local/jdk7/bin/jstat -gcutil 2514 1000
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
  0.00 100.00 100.00  99.62  24.20    716   28.151    24   44.250   72.401
  0.00 100.00 100.00  99.62  24.20    716   28.151    24   44.250   72.401
  0.00 100.00 100.00  99.62  24.20    716   28.151    24   44.250   72.401
  0.00 100.00 100.00  99.62  24.20    716   28.151    24   44.250   72.401
  0.00 100.00 100.00  99.62  24.20    716   28.151    24   44.250   72.401
  0.00 100.00 100.00  99.62  24.20    716   28.151    24   44.250   72.401
  0.00 100.00 100.00  99.62  24.20    716   28.151    24   44.250   72.401
  0.00 100.00 100.00  99.62  24.20    716   28.151    24   44.250   72.401
  0.00 100.00 100.00  99.62  24.20    716   28.151    24   44.250   72.401
  0.00 100.00 100.00  99.62  24.19    716   28.151    24   44.250   72.401
  0.00 100.00 100.00  99.62  24.19    716   28.151    24   44.250   72.401
  0.00   0.00   5.92  99.44  24.19    716   28.151    24   56.361   84.512
100.00   0.00 100.00  99.66  24.19    718   28.221    26   56.417   84.638
100.00   0.00 100.00  99.66  24.19    718   28.221    26   56.417   84.638
100.00   0.00 100.00  99.66  24.19    718   28.221    26   56.417   84.638
100.00   0.00 100.00  99.66  24.19    718   28.221    26   56.417   84.638
100.00   0.00 100.00  99.66  24.19    718   28.221    26   56.417   84.638
100.00   0.00 100.00  99.66  24.19    718   28.221    26   56.417   84.638
100.00   0.00 100.00  99.66  24.19    718   28.221    26   56.417   84.638
100.00   0.00 100.00  99.66  24.19    718   28.221    26   56.417   84.638
100.00   0.00 100.00  99.66  24.19    718   28.221    26   56.417   84.638
100.00   0.00 100.00  99.66  24.19    718   28.221    26   56.417   84.638
100.00   0.00 100.00  99.66  24.19    718   28.221    26   56.417   84.638
100.00   0.00 100.00  99.66  24.19    718   28.221    26   56.417   84.638
100.00   0.00 100.00  99.66  24.19    718   28.221    26   56.417   84.638
100.00   0.00  34.98  99.87  24.20    720   28.319    27   68.708   97.026
100.00   0.00 100.00  99.87  24.20    721   28.319    28   68.708   97.026
100.00   0.00 100.00  99.87  24.20    721   28.319    28   68.708   97.026
100.00   0.00 100.00  99.87  24.20    721   28.319    28   68.708   97.026
100.00   0.00 100.00  99.87  24.20    721   28.319    28   68.708   97.026

These are our heap settings:

 -Xms256m -Xmx8192m -XX:PermSize=128m -XX:MaxPermSize=1024m

We are using this JDK:

[myhost ~]$ /usr/local/jdk7/bin/java -XX:+PrintCommandLineFlags -version
-XX:InitialHeapSize=261346688 -XX:MaxHeapSize=4181547008 -XX:+PrintCommandLineFlags -XX:+UseCompressedOops -XX:+UseParallelGC
java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)

The first instance has 4+ days of uptime; the second instance was only up for minutes. We noticed it was doing frequent full GCs with huge pauses, so we reverted the settings.

What needs to be adjusted here so that the GC stats are not off the wall?

Upvotes: 0

Views: 2586

Answers (2)

Jans

Reputation: 11250

Through ergonomics, you can provide arguments that, together with the platform-dependent defaults, tune the GC's behavior to your application's characteristics. The fact that changing the collector increased the number of full GCs performed doesn't by itself mean the collector selection is wrong; the problem might be caused by something else.

In terms of behavior, ConcMarkSweepGC aims for shorter pauses in the application, but it does not promise anything about the number of GC events generated. Even though your second test shows an improvement in the average time taken by a full GC, you have to test for a much longer period to see how the collector adapts itself to meet its goals.

Based on recent history, the concurrent collector maintains estimates of the time remaining before the tenured generation will be exhausted and of the time needed for a concurrent collection cycle. Based on these dynamic estimates, a concurrent collection cycle will be started with the aim of completing the collection cycle before the tenured generation is exhausted.

Besides that, you have to run a few test cases (including other collectors) in order to find the right balance between the time taken by the GC (the collector's promise) and the number of GC events performed. You can get there by testing parameters that control how objects are promoted (based on the generational hypothesis), or the acceptable heap occupancy before the collector runs, like:

-XX:NewRatio=<N>
-XX:CMSInitiatingOccupancyFraction=<N>
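
For example, a command line along these lines could be a starting point. The sizes and thresholds below are placeholders chosen only to illustrate the flags, not recommendations, and myapp.jar stands in for your application; the values would have to be tuned against your own workload:

 java -Xms8192m -Xmx8192m \
      -XX:PermSize=128m -XX:MaxPermSize=1024m \
      -XX:+UseParNewGC -XX:+UseConcMarkSweepGC \
      -XX:NewRatio=3 \
      -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly \
      -jar myapp.jar

-XX:+UseCMSInitiatingOccupancyOnly makes CMS start its concurrent cycle strictly at the configured occupancy instead of relying on its adaptive estimates; whether pinning the trigger like this helps depends on how bursty your allocation rate is.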

Upvotes: 0

wonhee

Reputation: 1661

In your first capture there are only four records, but the cumulative full GC time is already 1005 seconds. In your second capture more records show up, but the cumulative full GC time is only 44~68 seconds. So if you compare the full GC time of your 1st and 2nd captures, the 2nd one, after you changed the configuration, actually has less full GC time.
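
As a rough back-of-the-envelope check using the cumulative FGC and FGCT columns from the captures above, the average pause per full GC also comes out lower after the change (the two instances had very different uptimes, so per-event averages are easier to compare than totals):

 first capture:  1005.334 s / 387 full GCs ≈ 2.60 s per full GC
 second capture:   56.417 s /  26 full GCs ≈ 2.17 s per full GC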

Upvotes: 0
