Reputation: 79
I am trying to report Cassandra 3.0 metrics to a Graphite server using metrics-graphite, as suggested here: http://www.datastax.com/dev/blog/pluggable-metrics-reporting-in-cassandra-2-0-2. When there is no load on the cluster, everything works fine and all metrics are reported properly. But once the cluster comes under load, I receive the following exception in system.log:
ERROR [metrics-graphite-reporter-1-thread-1] 2016-07-13 08:21:23,580 ScheduledReporter.java:119 - RuntimeException thrown from GraphiteReporter#report. Exception was suppressed.
java.lang.IllegalStateException: Unable to compute ceiling for max when histogram overflowed
at org.apache.cassandra.utils.EstimatedHistogram.rawMean(EstimatedHistogram.java:231) ~[apache-cassandra-3.0.7.jar:3.0.7]
at org.apache.cassandra.metrics.EstimatedHistogramReservoir$HistogramSnapshot.getMean(EstimatedHistogramReservoir.java:103) ~[apache-cassandra-3.0.7.jar:3.0.7]
at com.codahale.metrics.graphite.GraphiteReporter.reportHistogram(GraphiteReporter.java:265) ~[metrics-graphite-3.1.2.jar:3.1.2]
at com.codahale.metrics.graphite.GraphiteReporter.report(GraphiteReporter.java:179) ~[metrics-graphite-3.1.2.jar:3.1.2]
at com.codahale.metrics.ScheduledReporter.report(ScheduledReporter.java:162) ~[metrics-core-3.1.0.jar:3.1.0]
at com.codahale.metrics.ScheduledReporter$1.run(ScheduledReporter.java:117) ~[metrics-core-3.1.0.jar:3.1.0]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_91]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_91]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_91]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_91]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
This message is repeated every time the reporter tries to collect metrics, on every Cassandra node, and some metrics become unavailable. To receive the metrics again, I have to restart all Cassandra nodes, which is very impractical. I tried several metrics-graphite versions, from 3.1.0 to 3.1.2, all with the same issue.
Upvotes: 1
Views: 1379
Reputation: 11
Here is a workaround that suppresses this error, provided you can live without reporting Table and keyspace metrics to Graphite.
We are using DataStax Enterprise 5.0.1, which contains Cassandra 3.0.7.1159. I encountered this error in a brand new install (not an upgrade), using both metrics-graphite-2.2.0.jar and metrics-graphite-3.1.2.jar, so I don't think the error depends on the version of the Coda Hale/Yammer GraphiteReporter plug-in.
Researching the related CASSANDRA Jira tickets, it seems this error is caused by Cassandra 3.0 metric values growing larger than the GraphiteReporter can handle, so the underlying EstimatedHistogram overflows.
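To illustrate the failure mode, here is a toy Python sketch (not Cassandra's actual implementation) of a bucketed histogram like EstimatedHistogram: once a value lands beyond the largest bucket boundary, there is no upper edge to use as a ceiling, so computing the mean or max has to raise, which is exactly what GraphiteReporter's getMean() call trips over:

```python
# Toy boundaries; Cassandra's real offsets grow ~20% per bucket.
BUCKET_OFFSETS = [1, 2, 3, 4, 5, 7, 8, 10]

class ToyEstimatedHistogram:
    def __init__(self):
        # One extra bucket collects values beyond the last boundary.
        self.buckets = [0] * (len(BUCKET_OFFSETS) + 1)

    def add(self, value):
        for i, bound in enumerate(BUCKET_OFFSETS):
            if value <= bound:
                self.buckets[i] += 1
                return
        self.buckets[-1] += 1  # overflow bucket

    def is_overflowed(self):
        return self.buckets[-1] > 0

    def mean(self):
        # A value in the overflow bucket has no known upper edge,
        # so the mean (and max) cannot be bounded.
        if self.is_overflowed():
            raise ValueError(
                "Unable to compute ceiling for max when histogram overflowed")
        total = sum(self.buckets[:-1])
        weighted = sum(c * b for c, b in zip(self.buckets, BUCKET_OFFSETS))
        return weighted / total if total else 0

h = ToyEstimatedHistogram()
h.add(5)
h.add(100)   # larger than any boundary -> overflow bucket
# h.mean() now raises, and in Cassandra the reporter keeps hitting
# this on every scheduled run until the node is restarted.
```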
In my metrics-reporter-config.yaml, I was using a white-list wildcard pattern, so all metrics were reported to Graphite, like this:
graphite:
  -
    period: 60
    timeunit: 'SECONDS'
    prefix: 'dev.servers'
    hosts:
      - host: 'cassandra-1'
        port: 2003
    predicate:
      color: "white"
      useQualifiedName: false
      patterns:
        - ".*"
The workaround we discovered, by process of elimination, is to switch to a specific black list, as shown below, so that the Table and keyspace metrics are never reported; with that in place, the error goes away:
graphite:
  -
    period: 60
    timeunit: 'SECONDS'
    prefix: 'dev.servers'
    hosts:
      - host: 'cassandra-1'
        port: 2003
    predicate:
      color: "black"
      useQualifiedName: true
      patterns:
        - "^org.apache.cassandra.metrics.Table.+"
        - "^org.apache.cassandra.metrics.keyspace.+"
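To sanity-check which metric names the black-list patterns will exclude, you can test them against a few qualified names with a short script. The metric names below are illustrative examples, not an exhaustive list:

```python
import re

# The black-list patterns from the predicate above.
patterns = [
    r"^org.apache.cassandra.metrics.Table.+",
    r"^org.apache.cassandra.metrics.keyspace.+",
]

def excluded(name):
    """Return True if any black-list pattern matches the metric name."""
    return any(re.match(p, name) for p in patterns)

# Example qualified metric names (hypothetical samples).
for name in [
    "org.apache.cassandra.metrics.Table.ReadLatency.system.local",
    "org.apache.cassandra.metrics.keyspace.WriteLatency.system",
    "org.apache.cassandra.metrics.ClientRequest.Read.Latency",
]:
    print(name, "->", "excluded" if excluded(name) else "reported")
```

Note that useQualifiedName must be true for the patterns to be matched against the full `org.apache.cassandra.metrics.` prefix.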
I had to restart Cassandra after making this change. After the restart, the error message no longer appeared in the Cassandra system.log file, and the black-listed metric groups that caused it were no longer reported.
Upvotes: 1