Scudeler

Reputation: 91

Cassandra and G1 Garbage collector stop the world event (STW)

We have a 6-node Cassandra cluster under heavy utilization. We have been dealing a lot with garbage collector stop-the-world (STW) events, which can take up to 50 seconds on our nodes; in the meantime the Cassandra node is unresponsive, not even accepting new logins.

Extra details:

Any help would be very much appreciated!



Edit 1:

Checking the object-creation stats, they do not look healthy at all.



Edit 2:

I have tried the settings suggested by Chris Lohfink; here are the GC reports:

Using CMS suggested settings http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTcvMTAvOC8tLWdjLmxvZy4wLmN1cnJlbnQtLTE5LTAtNDk=

Using G1 suggested settings http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTcvMTAvOC8tLWdjLmxvZy4wLmN1cnJlbnQtLTE5LTExLTE3

The behavior remains basically the same:

  1. Old Gen starts to fill up.
  2. GC can't clean it properly without a full GC and a STW event.
  3. The full GC starts to take longer, until the node is completely unresponsive.
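To confirm the pattern above from the raw logs, a quick sketch for pulling full-GC pause durations out of a gc.log. The sample log lines below are made up for illustration, and the line format is an assumption based on typical `-XX:+PrintGCDetails` output on Java 8; adjust the pattern to match your JVM's log format.

```shell
# Hypothetical sample gc.log (format assumed; replace with your real log path).
cat > /tmp/gc-sample.log <<'EOF'
2017-10-08T19:00:01.123+0000: 12.345: [GC pause (G1 Evacuation Pause) (young), 0.0456789 secs]
2017-10-08T19:05:42.987+0000: 354.210: [Full GC (Allocation Failure)  11G->9G(12G), 48.1234567 secs]
EOF

# Print the duration (in seconds) of every full GC, one per line.
grep 'Full GC' /tmp/gc-sample.log |
  sed -n 's/.*, \([0-9.]*\) secs\]/\1/p'
```

On the sample above this prints only the 48-second full-GC pause, which matches the "full GC takes longer and longer" progression described in the list.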

I'm going to get the cfstats output for maximum partition size and tombstones per read as soon as possible and edit the post again.
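For reference, a sketch of pulling those two figures out of `nodetool cfstats <keyspace>` output. The sample text below is invented for illustration (the field names match the cfstats/tablestats format in recent Cassandra versions); in practice you would pipe `nodetool cfstats` straight into the grep.

```shell
# Hypothetical cfstats output, saved to a file for illustration.
cat > /tmp/cfstats-sample.txt <<'EOF'
		Compacted partition maximum bytes: 2874382626
		Average tombstones per slice (last five minutes): 812.0
		Maximum tombstones per slice (last five minutes): 50123
EOF

# Show only the max partition size and tombstone-per-read lines.
grep -E 'Compacted partition maximum bytes|tombstones per slice' /tmp/cfstats-sample.txt
```

A compacted partition maximum in the gigabytes, or tombstone counts per slice in the tens of thousands, would both point at the data-model problems suggested in the answers.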

Upvotes: 2

Views: 3171

Answers (2)

Chris Lohfink
Chris Lohfink

Reputation: 16420

Without knowing your existing settings or possible data model problems, here's a guess at some conservative settings to try to reduce evacuation pauses caused by not having enough to-space (check the GC logs):

-Xmx12G -Xms12G -XX:+UseG1GC -XX:G1ReservePercent=25 -XX:G1RSetUpdatingPauseTimePercent=5 -XX:MaxGCPauseMillis=500 -XX:-ReduceInitialCardMarks -XX:G1HeapRegionSize=32m

Setting G1HeapRegionSize should also help reduce the pause from updating the remembered set, which becomes an issue, and cut down on humongous objects, which can become a problem depending on the data model. Make sure -Xmn is not set.
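Assuming Cassandra 3.x, these flags typically go one per line in conf/jvm.options (on 2.x, they would instead be appended to JVM_OPTS in conf/cassandra-env.sh). A fragment for the G1 settings above might look like:

```
# conf/jvm.options fragment (Cassandra 3.x) -- one JVM flag per line.
# Remove or comment out any existing -Xmn entry, per the advice above.
-Xms12G
-Xmx12G
-XX:+UseG1GC
-XX:G1ReservePercent=25
-XX:G1RSetUpdatingPauseTimePercent=5
-XX:MaxGCPauseMillis=500
-XX:-ReduceInitialCardMarks
-XX:G1HeapRegionSize=32m
```

Restart the node after editing so the new flags take effect, and verify them with `ps aux | grep cassandra`.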

For what it's worth, a 12 GB heap with C* is probably better suited to CMS; you can certainly get better throughput. Just be careful of fragmentation over time from the rather large objects that can get allocated.

-XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=55 -XX:MaxTenuringThreshold=3 -Xmx12G -Xms12G -Xmn3G -XX:+CMSEdenChunksRecordAlways -XX:+CMSParallelInitialMarkEnabled -XX:+CMSParallelRemarkEnabled -XX:CMSWaitDuration=10000 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCondCardMark 

Most likely, though, there's an issue with your data model, or you're under-provisioned.

Upvotes: 2

Gil Tene
Gil Tene

Reputation: 798

Have you looked at using Zing? Cassandra situations like these are a classic use case, as Zing fundamentally eliminates all GC-related glitches in Cassandra nodes and clusters.

You can see some details on the how/why in my recent "Understanding GC" talk from JavaOne (https://www.slideshare.net/howarddgreen/understanding-gc-javaone-2017). Or just skip to slides 56-60 for Cassandra-specific results.

Upvotes: 3
