Krzysztof Genge

Reputation: 11

Cannot get rid of tombstones in Cassandra 2.1.8 using SizeTieredCompactionStrategy (STCS)

I have a 3-node Cassandra (2.1.8) cluster on which I am running an application that uses Titan DB (v0.5.4). The amount of data is very small (<20 MB), but because my use case requires deletes from time to time, I already have problems with tombstones. I cannot get rid of tombstones that have already been created. The solutions I tried are:

As a result, the statistics dropped a little, but Average tombstones per slice and Maximum tombstones per slice are still too high:

Table: graphindex
    **SSTable count: 1**
    Space used (live): 661873
    Space used (total): 661873
    Space used by snapshots (total): 0
    Off heap memory used (total): 6544
    SSTable Compression Ratio: 0.6139286819777781
    Number of keys (estimate): 4082
    Memtable cell count: 0
    Memtable data size: 0
    Memtable off heap memory used: 0
    Memtable switch count: 15
    Local read count: 25983
    Local read latency: 0.931 ms
    Local write count: 23610
    Local write latency: 0.057 ms
    Pending flushes: 0
    Bloom filter false positives: 0
    Bloom filter false ratio: 0.00000
    Bloom filter space used: 5208
    Bloom filter off heap memory used: 5200
    Index summary off heap memory used: 1248
    Compression metadata off heap memory used: 96
    Compacted partition minimum bytes: 43
    Compacted partition maximum bytes: 152321
    Compacted partition mean bytes: 203
    Average live cells per slice (last five minutes): 728.4188892737559
    Maximum live cells per slice (last five minutes): 4025.0
    **Average tombstones per slice (last five minutes): 317.34938228841935**
    **Maximum tombstones per slice (last five minutes): 8031.0**

Is there any way to remove all tombstones? Thanks in advance for any suggestions.

Upvotes: 0

Views: 365

Answers (1)

Krzysztof Genge

Reputation: 11

The problem is solved.

It turned out that the statistics are very misleading. 'Average tombstones per slice (last five minutes)' and 'Maximum tombstones per slice (last five minutes)', and probably the live cells statistics too, are not computed over the last five minutes as nodetool cfstats reports. Instead, they are calculated since node startup. My nodes had been running for a few months, so even though the tombstones had been cleared I could not see much difference: the months of already high values dominated the statistics. After I restarted the nodes, the statistics were reset and I could see that the compaction had taken effect.
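To illustrate why the numbers barely moved, here is a minimal Python sketch comparing a mean taken over every sample since startup with a mean taken over only the most recent samples. It is not Cassandra's implementation; the function names and the sample values (1000 reads that each scanned 300 tombstones, followed by 10 reads that scanned none) are made up for the example.

```python
def lifetime_mean(samples):
    """Mean over every sample recorded since startup."""
    return sum(samples) / len(samples)


def windowed_mean(samples, window):
    """Mean over only the most recent `window` samples."""
    recent = samples[-window:]
    return sum(recent) / len(recent)


# Hypothetical tombstones-per-read values, not real cluster data.
before_cleanup = [300] * 1000  # long history of tombstone-heavy reads
after_cleanup = [0] * 10       # a few reads after the tombstones are gone
samples = before_cleanup + after_cleanup

print(lifetime_mean(samples))      # stays close to 300
print(windowed_mean(samples, 10))  # 0.0
print(max(samples))                # lifetime maximum is still 300
```

With a long enough history, the since-startup mean and maximum stay high no matter how clean the recent reads are, which matches what I saw until the restart reset the counters.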

It's a shame that the information about this bug in the statistic descriptions was so hard for me to find (https://issues.apache.org/jira/browse/CASSANDRA-7731).

I hope this helps someone find this information sooner.

Upvotes: 1
