Reputation: 21
This is a situation at the company I work for that's costing me sleep, since I'm up against a wall and running out of options. I have a production cluster of 51 Cassandra 3.11.9 nodes carrying a large load (600 to 800 GB per node) that is showing a very annoying issue: for no known reason, these machines start hitting long GC pauses that really hurt my application's response time, creating latency spikes, since the consistency level on the clients is ONE.
Here are some samples of these pauses, taken from the GC logs of a few nodes:
28 | CHANGED | rc=0 >>
INFO [Service Thread] 2023-07-14 01:56:45,583 GCInspector.java:285 - G1 Old Generation GC in 12039ms. G1 Old Gen: 54648149088 -> 12146746552; G1 Survivor Space: 134217728 -> 0;
INFO [Service Thread] 2023-07-14 02:14:24,394 GCInspector.java:285 - G1 Old Generation GC in 57918ms. G1 Old Gen: 67780410096 -> 59704216816; Metaspace: 61436792 -> 61302392
INFO [Service Thread] 2023-07-14 02:15:44,506 GCInspector.java:285 - G1 Old Generation GC in 64576ms. G1 Old Gen: 67971190408 -> 64736391536;
INFO [Service Thread] 2023-07-14 02:17:06,520 GCInspector.java:285 - G1 Old Generation GC in 66242ms. G1 Old Gen: 68043573704 -> 66792790424;
INFO [Service Thread] 2023-07-14 02:21:31,210 GCInspector.java:285 - G1 Old Generation GC in 257268ms. G1 Old Gen: 68046631936 -> 67703054448;
254 | CHANGED | rc=0 >>
INFO [Service Thread] 2023-07-14 02:36:26,170 GCInspector.java:285 - G1 Old Generation GC in 46654ms. G1 Old Gen: 133621345752 -> 49403423024; Metaspace: 67436096 -> 67339688
INFO [Service Thread] 2023-07-14 02:38:58,627 GCInspector.java:285 - G1 Old Generation GC in 89392ms. G1 Old Gen: 133594285096 -> 103157948104;
INFO [Service Thread] 2023-07-14 02:40:59,754 GCInspector.java:285 - G1 Old Generation GC in 93345ms. G1 Old Gen: 135071359720 -> 105377369048; G1 Survivor Space: 33554432 -> 0;
INFO [Service Thread] 2023-07-14 02:43:29,171 GCInspector.java:285 - G1 Old Generation GC in 106174ms. G1 Old Gen: 133812654600 -> 119264140552; G1 Survivor Space: 234881024 -> 0;
INFO [Service Thread] 2023-07-14 02:45:36,900 GCInspector.java:285 - G1 Old Generation GC in 95625ms. G1 Old Gen: 135225564784 -> 99943593104;
INFO [Service Thread] 2023-07-14 02:46:53,820 GCInspector.java:285 - G1 Old Generation GC in 55875ms. G1 Old Gen: 133359614104 -> 60924511688; G1 Survivor Space: 872415232 -> 0;
INFO [Service Thread] 2023-07-14 02:48:22,803 GCInspector.java:285 - G1 Old Generation GC in 38493ms. G1 Old Gen: 133978126912 -> 36277631424;
INFO [Service Thread] 2023-07-14 02:50:11,320 GCInspector.java:285 - G1 Old Generation GC in 34789ms. G1 Old Gen: 134004918888 -> 35377344368;
250 | CHANGED | rc=0 >>
INFO [Service Thread] 2023-07-14 00:18:52,262 GCInspector.java:285 - G1 Old Generation GC in 96017ms. G1 Old Gen: 73628910144 -> 59159105432; Metaspace: 58018496 -> 57907432
INFO [Service Thread] 2023-07-14 00:46:41,400 GCInspector.java:285 - G1 Old Generation GC in 30177ms. G1 Old Gen: 41448088568 -> 24094354384; G1 Survivor Space: 67108864 -> 0;
INFO [Service Thread] 2023-07-14 02:18:34,910 GCInspector.java:285 - G1 Old Generation GC in 40940ms. G1 Old Gen: 74016882928 -> 27759131352; Metaspace: 57315192 -> 57128720
INFO [Service Thread] 2023-07-14 02:36:02,256 GCInspector.java:285 - G1 Old Generation GC in 57658ms. G1 Old Gen: 73488401080 -> 40838191112; Metaspace: 54701984 -> 54651552
INFO [Service Thread] 2023-07-14 02:37:47,374 GCInspector.java:285 - G1 Old Generation GC in 87036ms. G1 Old Gen: 73498188264 -> 65920831896;
INFO [Service Thread] 2023-07-14 02:39:58,921 GCInspector.java:285 - G1 Old Generation GC in 111435ms. G1 Old Gen: 73496794000 -> 70079092144;
In recent months I've tried several things, such as:
Currently, the following options are not available to me:
Right now, the only workaround I have is running "disablebinary" on the affected nodes to avoid the latency peaks, but that's not a real solution.
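("disablebinary" here means cutting client (CQL) traffic to the node and restoring it later, roughly:
nodetool disablebinary
nodetool enablebinary)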
The JVM arguments:
INFO [main] 2023-07-17 18:40:11,668 CassandraDaemon.java:507 - JVM Arguments: [-javaagent:/opt/simility/include/exporters/jmxexporter/jmx_prometheus_javaagent-0.12.0.jar=7070:/opt/simility/include/exporters/jmxexporter/cassandra.yml, -ea, -javaagent:/opt/simility/include/cassandra/lib/jamm-0.3.0.jar, -XX:+UseThreadPriorities, -XX:ThreadPriorityPolicy=42, -Xms64G, -Xmx64G, -XX:+ExitOnOutOfMemoryError, -Xss256k, -XX:StringTableSize=1000003, -XX:+UseG1GC, -XX:G1RSetUpdatingPauseTimePercent=5, -XX:MaxGCPauseMillis=500, -Djava.net.preferIPv4Stack=true, -Dcassandra.config=file:///opt/simility/conf/cassandra/cassandra.yaml, -Djava.rmi.server.hostname=172.33.135.28, -Dcom.sun.management.jmxremote.port=7199, -Dcom.sun.management.jmxremote.rmi.port=7199, -Dcom.sun.management.jmxremote.ssl=false, -Dcom.sun.management.jmxremote.authenticate=false, -Dcassandra.libjemalloc=/lib64/libjemalloc.so.1, -Dlogback.configurationFile=logback.xml, -Dcassandra.config=file:///opt/simility/conf/cassandra/cassandra.yaml, -Dcassandra.logdir=/opt/simility/log/cassandra, -Dcassandra.storagedir=/opt/simility/include/cassandra/data]
Thank you!
Upvotes: 1
Views: 518
Reputation: 466
The number one reason for this type of behavior is humongous allocations. I wrote up a paper on it here:
You can figure out whether the issue is humongous allocations by adding the following to your JVM settings (see the note after the list on where these go):
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintHeapAtGC
-XX:+PrintTenuringDistribution
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintPromotionFailure
-XX:PrintFLSStatistics=1
-Xloggc:<file-path>
-XX:+PrintAdaptiveSizePolicy
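For Cassandra 3.11 these normally go one per line into conf/jvm.options, or get appended to JVM_OPTS in cassandra-env.sh (paths will differ in a custom layout like the /opt/simility one above, and the log path below is just a placeholder):
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC"
JVM_OPTS="$JVM_OPTS -Xloggc:/path/to/gc.log"
Note these are the Java 8 GC-logging flags, which is what Cassandra 3.11 normally runs on.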
Once you have the GC logs, you can look for humongous allocations in them:
14999: 6153.047: [G1Ergonomics (Concurrent Cycles) do not request concurrent cycle initiation, reason: still doing mixed collections, occupancy: 18253611008 bytes, allocation request: 4299032 bytes, threshold: 9663676380 bytes (45.00 %), source: concurrent humongous allocation]
That's roughly a 4.1 MiB allocation request. In this case the heap region size was 8 MiB, and a humongous allocation is anything larger than half the region size, so this request qualifies. The region size is computed as:
heap size in MiB / 2048, rounded down to the nearest power of 2 (i.e. 1, 2, 4, 8, 16 or 32 MiB, where 32 MiB is the max).
You can also see the heap region size in your gc logs:
14.414: [G1Ergonomics (Concurrent Cycles) request concurrent cycle initiation, reason: requested by GC cause, GC cause: Metadata GC Threshold]
{Heap before GC invocations=0 (full 0):
garbage-first heap total 20971520K, used 212992K [0x00000002c0000000, 0x00000002c0805000, 0x00000007c0000000)
region size 8192K, 27 young (221184K), 0 survivors (0K)
In this case it's 8192K.
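To make that concrete, here's a small standalone sketch (my own illustration, not Cassandra code) of the arithmetic above, using the 64 GiB heap from the question's JVM arguments and the numbers from the log lines:
public class RegionSize {

    // Default G1 region size per the rule above: heap size / 2048, rounded down
    // to a power of two, clamped between 1 MiB and 32 MiB.
    static long defaultRegionSizeBytes(long heapBytes) {
        long target = heapBytes / 2048;
        long region = 1L << 20;                         // 1 MiB minimum
        while (region * 2 <= target && region < (32L << 20)) {
            region *= 2;                                // stops at the 32 MiB cap
        }
        return region;
    }

    public static void main(String[] args) {
        long regionSize = defaultRegionSizeBytes(64L << 30);   // -Xmx64G from the question
        System.out.printf("default region size: %d MiB%n", regionSize >> 20);
        System.out.printf("humongous threshold: %d bytes%n", regionSize / 2);

        // The 4,299,032-byte request from the sample log, against the 8 MiB regions shown there:
        long logRegionSize = 8L << 20;
        System.out.printf("humongous in 8 MiB regions? %b%n", 4_299_032L > logRegionSize / 2);

        // Sanity check against the heap header above: 20971520K = 20 GiB -> 10 MiB,
        // rounded down to 8 MiB, matching "region size 8192K".
        System.out.printf("region size for a 20 GiB heap: %d MiB%n",
                defaultRegionSizeBytes(20L << 30) >> 20);
    }
}
Note that the 64 GiB heap from the question already lands on the 32 MiB maximum, so the humongous threshold on those nodes is already 16 MiB.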
If you do see humongous allocations, you can either increase the heap size (the -Xmx parameter) or increase the region size:
-XX:G1HeapRegionSize=32M
32M is the max.
Ideally that's not the long-term fix; the real fix is to get those mutation sizes down, but it should buy you some leeway. If it doesn't, you may have a single object that's causing the headaches. I've seen that before as well. To identify that, you'll need a memory dump from just before the issue hits, which is not easy. In an ideal world you'll have -XX:+HeapDumpOnOutOfMemoryError set, the node will OOM, and you'll have a heap dump to analyze with Eclipse MAT.
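If you go that route, the flags are the following (set them wherever the rest of your JVM options live; the dump path is just an example and should point at a volume with roughly a heap's worth of free space, so tens of GB here):
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/path/with/enough/space
The resulting .hprof can then be opened in Eclipse MAT; the dominator tree view is usually the quickest way to spot the offending objects.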
Upvotes: 2