Reputation: 31
We use Cassandra 2.0.10 on a 5-node cluster. Sometimes we get a large number of SliceQueryFilter.java (line 225) Read 2 live and 1056 tombstoned cells ...
messages in the Cassandra log on one particular node, and that node brings down the performance of the whole database. We have to restart the Cassandra service on that node to resolve the performance issue.
Does anyone know what the root cause could be, and how to fix it?
Upvotes: 2
Views: 584
Reputation: 57748
Read 2 live and 1056 tombstoned cells
It sounds like you are dealing with a poor data model. This is what happens when you have a model that supports a high number of DELETE operations. For the message you mentioned above, that query had to sort through 1056 tombstones just to return 2 values that the application actually cared about. Cassandra doesn't do well with DELETEs. So if you plan to support DELETEs, then your model needs to be designed to mitigate tombstone placement.
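To see how widespread the problem is, a small script can pull the counts out of these warnings. This is only a minimal sketch: it assumes nothing beyond the "Read N live and M tombstoned cells" fragment quoted in the question, and the function name and sample lines are made up for illustration.

```python
import re

# Matches only the fragment quoted in the question; the rest of the line is ignored.
TOMBSTONE_RE = re.compile(r"Read (\d+) live and (\d+) tombstoned cells")

def tombstone_stats(lines):
    """Return a (live, tombstoned) pair for every matching log line."""
    stats = []
    for line in lines:
        match = TOMBSTONE_RE.search(line)
        if match:
            stats.append((int(match.group(1)), int(match.group(2))))
    return stats

# Hypothetical sample input; real log lines carry more fields than this.
sample = [
    "SliceQueryFilter.java (line 225) Read 2 live and 1056 tombstoned cells",
    "some unrelated log line",
]
for live, dead in tombstone_stats(sample):
    print(f"{live} live, {dead} tombstoned")
```

Running something like this over the log of each node would show whether the tombstone-heavy reads really are concentrated on one node.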
The way around this is to have your application team model the table for these queries in such a way that supports immutable writes. This usually means re-working the table as a time-series. Of course, without seeing the offending model, I can only speculate.
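One common way to get time-series style, write-once rows is to put a time bucket into the partition key, so that a query for recent data only touches recent partitions. Since the actual model was not posted, the following is purely a sketch: the sensor_id name and the daily bucket size are hypothetical choices.

```python
from datetime import datetime, timezone

def partition_key(sensor_id, event_time):
    """Build a (sensor_id, day) partition key; both parts are hypothetical."""
    # Normalize to UTC so every writer agrees on which day a row belongs to.
    day = event_time.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return (sensor_id, day)

ts = datetime(2015, 3, 1, 23, 30, tzinfo=timezone.utc)
print(partition_key("sensor-42", ts))
```

The bucket size is a trade-off: it should be small enough to keep partitions bounded, and large enough that typical queries hit only a few of them.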
on one particular node
Does this always happen on the same node? If so, then it sounds like you may be falling into another data modeling trap where too much data is written to a single partition, creating a "hot spot" in your cluster.
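A quick way to check for this is to count writes per partition key on the application side and look at how skewed the distribution is. A rough sketch, assuming you can collect a non-empty list of the partition keys your application writes to; the key values here are invented:

```python
from collections import Counter

def hottest_partition(keys):
    """Return (key, share_of_writes) for the most written partition key."""
    counts = Counter(keys)
    key, n = counts.most_common(1)[0]
    return key, n / len(keys)

# Hypothetical sample: three of the four writes go to the same partition.
keys = ["user:1", "user:2", "user:1", "user:1"]
print(hottest_partition(keys))
```

If one key takes a large share of all writes, the replicas that own it will carry a matching share of the load.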
If it is not always the same node, then it sounds like a node is being used as a coordinator to execute too many requests. Make sure your application team is using a token-aware load balancing policy (TokenAwarePolicy in the DataStax Java driver) in their driver code, and that they are not using BATCH statements incorrectly.
How do you know if BATCH is being used incorrectly?
If BATCH is used to provide atomic updates within a single partition, then it is being used properly. If BATCH is being used to improve performance by applying a series of updates in a single network trip, then it is being used incorrectly, because a single coordinator node has to handle every partition the batch touches. If you are using Spring Data Cassandra, it actually does this behind the scenes when persisting a list of objects.
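If the application does need batches, one option is to group the pending writes by partition key first and send one batch per group, so that no batch spans partitions. A minimal sketch of the grouping step only, assuming each write is a dict with a hypothetical "pk" field; actually sending the batches through a driver is left out:

```python
from collections import defaultdict

def group_by_partition(writes, key="pk"):
    """Group writes so that each group touches exactly one partition key."""
    groups = defaultdict(list)
    for write in writes:
        groups[write[key]].append(write)
    return dict(groups)

# Hypothetical pending writes; "pk" stands in for the real partition key column.
writes = [
    {"pk": "a", "value": 1},
    {"pk": "b", "value": 2},
    {"pk": "a", "value": 3},
]
batches = group_by_partition(writes)
for pk, rows in batches.items():
    print(pk, len(rows))
```

Each resulting group can then go out as its own single-partition batch, or simply as individual writes.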
Upvotes: 3