kee

Reputation: 11619

Garbage Collection Optimization in Java/Tomcat/Solr

When replication runs between master and slave in Solr (Tomcat is the container), there is a GC spike (it takes about 200ms), and it seems to reclaim far more memory than necessary (a big, sharp drop in the amount of used memory). First of all, is this 200ms reasonable? Is it something other folks are seeing? Secondly, is there a way to make GC less drastic (reclaiming less, so that the disruption is smaller)? I am not sure whether what I am trying to do is doable, or whether I am attacking the problem from the right direction.

Here are my GC parameters for your reference:

-XX:+DisableExplicitGC 
-XX:+UseConcMarkSweepGC 
-XX:+CMSParallelRemarkEnabled
-XX:CMSInitiatingOccupancyFraction=30
-XX:ParallelCMSThreads=6 
-XX:PermSize=64m 
-XX:MaxPermSize=64m 
-Xms32g 
-Xmx32g 
-XX:NewSize=512m
-XX:MaxNewSize=512m
-XX:TargetSurvivorRatio=90 
-XX:SurvivorRatio=8 
-XX:MaxTenuringThreshold=15 
-XX:+UseStringCache 
-XX:+OptimizeStringConcat 
-XX:+UseCompressedOops 
-XX:+PrintGC 
-XX:+PrintGCDetails 
-XX:+PrintGCTimeStamps
-XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=...
-XX:+UseNUMA 
-XX:+UseCompressedStrings 
-XX:+UseBiasedLocking

Upvotes: 4

Views: 4089

Answers (4)

Yonik

Reputation: 2381

One way to solve Solr garbage collection issues is to move many of the large data structures like the filterCache and the FieldCache off-heap.

Heliosearch is a Solr fork that does just that (off-heap data structures). See the following blogs for performance results so far:

http://heliosearch.org/off-heap-filters/

http://heliosearch.org/solr-off-heap-fieldcache/
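To illustrate the general idea of off-heap storage (this is not Heliosearch's code, only a minimal sketch using the standard `ByteBuffer.allocateDirect` API), here is a fixed-size bit set, similar in spirit to a cached filter, whose bits live outside the Java heap. The class name `OffHeapBitSet` and its methods are hypothetical names chosen for this example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical example: a fixed-size bit set backed by a direct (off-heap) buffer. */
final class OffHeapBitSet {
    // Direct buffer: the backing memory is allocated outside the Java heap,
    // so the collector only tracks this small wrapper object, not the bits.
    private final ByteBuffer bits;
    private final int numBits;

    OffHeapBitSet(int numBits) {
        if (numBits < 0) {
            throw new IllegalArgumentException("numBits must be >= 0");
        }
        this.numBits = numBits;
        int numWords = (numBits + 63) >>> 6; // 64 bits per long word
        // Direct buffers start out zero-filled, so every bit is initially clear.
        this.bits = ByteBuffer.allocateDirect(numWords * Long.BYTES)
                              .order(ByteOrder.nativeOrder());
    }

    /** Marks docId as a member of the set. */
    void set(int docId) {
        checkIndex(docId);
        int offset = (docId >>> 6) * Long.BYTES;
        bits.putLong(offset, bits.getLong(offset) | (1L << (docId & 63)));
    }

    /** Returns true if docId was previously set. */
    boolean get(int docId) {
        checkIndex(docId);
        int offset = (docId >>> 6) * Long.BYTES;
        return (bits.getLong(offset) & (1L << (docId & 63))) != 0;
    }

    /** Counts the set bits by scanning the buffer one long at a time. */
    int cardinality() {
        int count = 0;
        for (int offset = 0; offset < bits.capacity(); offset += Long.BYTES) {
            count += Long.bitCount(bits.getLong(offset));
        }
        return count;
    }

    private void checkIndex(int docId) {
        if (docId < 0 || docId >= numBits) {
            throw new IndexOutOfBoundsException("docId out of range: " + docId);
        }
    }

    public static void main(String[] args) {
        OffHeapBitSet filter = new OffHeapBitSet(1_000_000);
        filter.set(42);
        filter.set(999_999);
        System.out.println("doc 42 matches: " + filter.get(42));
        System.out.println("matching docs: " + filter.cardinality());
    }
}
```

The trade-off is that off-heap memory is not reclaimed by ordinary heap collections in the same way as heap objects, so a real cache needs its own lifecycle management; the sketch above leaves that out.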

Upvotes: 1

Gil Tene

Reputation: 798

There is actually a quick and simple way around this sort of GC-related timeout that doesn't depend on complicated data gathering and tuning, and that will work every time as long as you are running on Linux.

As noted elsewhere, whether or not the timeout spikes caused by your Newgen, CMS, or FullGC pauses are acceptable depends on your requirements. Also, it is true that tuning the HotSpot GC mechanisms is a complicated art, and that you would normally need a lot more detail and iterative experimentation to figure out how to improve on your current behavior.

However, if you want all those pauses and related timeouts gone without getting a PhD in GC tuning, there is a simple, slam-dunk way to do that: the Zing JVM will run that 32GB heap Solr setup with GC never breaking a sweat, and without any GC-related pauses, disruptions or related timeouts. And it will do so out of the box, with default parameters, and practically no tuning.

And yes, I work at Azul, and am proud of it. We save people with this sort of problem weeks of effort and tons of timeout-related embarrassment all the time.

Upvotes: 5

Aleš

Reputation: 9028

What is and is not reasonable in terms of a GC spike depends on the given application.

You need to observe the GC behavior over a longer period of time to reason about some spikes being unreasonably higher than others.
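One way to collect such observations, as a minimal sketch, is to poll the standard `GarbageCollectorMXBean` counters at a fixed interval and log the deltas. The class name `GcSampler` and its helper methods are hypothetical. The sketch samples the JVM it runs in, so to watch Solr you would run equivalent code inside the Tomcat JVM or read the same beans over JMX. Note also that for a concurrent collector such as CMS, the reported collection time is an approximate elapsed time and is not necessarily the same as stop-the-world pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical example: samples cumulative GC counters and prints per-interval deltas. */
final class GcSampler {

    /** Sum of collection counts over all collectors since JVM start. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) { // -1 means the collector does not report a count
                total += count;
            }
        }
        return total;
    }

    /** Sum of approximate accumulated collection time (ms) over all collectors. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) { // -1 means the collector does not report a time
                total += millis;
            }
        }
        return total;
    }

    /** Average time per collection in an interval; 0 if nothing was collected. */
    static double averageMillis(long collections, long millis) {
        return collections == 0 ? 0.0 : (double) millis / collections;
    }

    public static void main(String[] args) throws InterruptedException {
        long prevCount = totalCollections();
        long prevMillis = totalCollectionMillis();
        // Five one-second samples keep the demo short; a real monitor would run longer.
        for (int i = 0; i < 5; i++) {
            Thread.sleep(1000);
            long count = totalCollections();
            long millis = totalCollectionMillis();
            long deltaCount = count - prevCount;
            long deltaMillis = millis - prevMillis;
            System.out.printf("collections=%d time=%dms avg=%.1fms%n",
                    deltaCount, deltaMillis, averageMillis(deltaCount, deltaMillis));
            prevCount = count;
            prevMillis = millis;
        }
    }
}
```

Comparing these per-interval numbers during replication against a quiet baseline gives a more reliable picture than a single spike.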

Full GC pauses of 1-3 seconds are relatively reasonable with 16-32GB heap sizes. Young GC pauses can be around 200ms.

Upvotes: 1

codethulhu

Reputation: 3996

Garbage collection tuning is a complicated topic. Your garbage collection pause may or may not be too long, depending on your needs. We can't know those requirements. Your heap may or may not be sized correctly. Your heap may not be partitioned correctly. You may benefit from using a different garbage collection algorithm. We can't answer those questions for you. There is no correct formula for garbage collection. As such, all you can do is adjust the settings and measure until your application's runtime behavior meets your requirements.
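As a starting point for reasoning about partitioning, it helps to work out what the flags in the question nominally ask for. The sketch below does that arithmetic for `-Xmx32g`, `-XX:NewSize=512m`, `-XX:SurvivorRatio=8` and `-XX:CMSInitiatingOccupancyFraction=30`. The class name `HeapLayout` and its methods are hypothetical, and the results are approximations: the JVM aligns generation sizes, and, as far as I know, the occupancy fraction is only a strict trigger when `-XX:+UseCMSInitiatingOccupancyOnly` is also set.

```java
/** Hypothetical example: nominal generation sizes implied by a set of HotSpot flags. */
final class HeapLayout {

    /** Eden size: SurvivorRatio is eden divided by one survivor space, and there are two survivors. */
    static double edenMb(double youngMb, int survivorRatio) {
        return youngMb * survivorRatio / (survivorRatio + 2);
    }

    /** Size of each of the two survivor spaces. */
    static double survivorMb(double youngMb, int survivorRatio) {
        return youngMb / (survivorRatio + 2);
    }

    /** Old generation is whatever the young generation leaves of the heap. */
    static double oldGenMb(double heapMb, double youngMb) {
        return heapMb - youngMb;
    }

    /** Old-generation occupancy at which a CMS cycle is nominally started. */
    static double cmsTriggerMb(double oldMb, int occupancyPercent) {
        return oldMb * occupancyPercent / 100.0;
    }

    public static void main(String[] args) {
        double heapMb = 32 * 1024; // -Xmx32g
        double youngMb = 512;      // -XX:NewSize=512m, -XX:MaxNewSize=512m
        int survivorRatio = 8;     // -XX:SurvivorRatio=8
        int occupancy = 30;        // -XX:CMSInitiatingOccupancyFraction=30

        double oldMb = oldGenMb(heapMb, youngMb);
        System.out.printf("eden          ~ %.1f MB%n", edenMb(youngMb, survivorRatio));
        System.out.printf("each survivor ~ %.1f MB%n", survivorMb(youngMb, survivorRatio));
        System.out.printf("old gen       ~ %.1f MB%n", oldMb);
        System.out.printf("CMS trigger   ~ %.1f MB%n", cmsTriggerMb(oldMb, occupancy));
    }
}
```

With these inputs the young generation is about 1.6% of the heap, and a CMS cycle is nominally requested once the old generation is roughly 30% full, so the ratio between the two is one of the first things to experiment with.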

There are lots of options for how to manage your JVM. You can start here.

Upvotes: 3
