Reputation: 93
I have a problem with Cassandra.
If I run nodetool -h 10.169.20.8 cfstats name.name -H
I get results, and the stats look like this:
Read Count: 0
Read Latency: NaN ms.
Write Count: 739812
Write Latency: 0.038670616318740435 ms.
Pending Flushes: 0
Table: name
SSTable count: 10
Space used (live): 1.48 GB
Space used (total): 1.48 GB
Space used by snapshots (total): 0 bytes
Off heap memory used (total): 3.04 MB
SSTable Compression Ratio: 0.5047407001982581
Number of keys (estimate): 701190
Memtable cell count: 22562
Memtable data size: 14.12 MB
Memtable off heap memory used: 0 bytes
Memtable switch count: 7
Local read count: 0
Local read latency: NaN ms
Local write count: 739812
Local write latency: 0.043 ms
Pending flushes: 0
Bloom filter false positives: 0
Bloom filter false ratio: 0.00000
Bloom filter space used: 2.39 MB
Bloom filter off heap memory used: 2.39 MB
Index summary off heap memory used: 302.03 KB
Compression metadata off heap memory used: 366.3 KB
Compacted partition minimum bytes: 87 bytes
Compacted partition maximum bytes: 3.22 MB
Compacted partition mean bytes: 2.99 KB
Average live cells per slice (last five minutes): 1101.2357892212697
Maximum live cells per slice (last five minutes): 1109
Average tombstones per slice (last five minutes): 271.6848030693603
Maximum tombstones per slice (last five minutes): 1109
Dropped Mutations: 0 bytes
Why are the tombstone stats not 0? We only write into Cassandra here; no one deletes records. We don't use TTL; it is left at the default setting.
Second problem (probably connected to the first): the number of rows in the tables changes randomly, and we don't understand what is going on.
Upvotes: 0
Views: 798
Reputation: 583
I know the question and the issue date back some years, but in case someone has the same issue with newer Cassandra versions (3+) and wants to remove deleted data, they can run nodetool garbagecollect.
Upvotes: 0
Reputation: 2389
Writing a value of null in a column is the same as a deletion and causes a tombstone. Wait... Say What.
Upvotes: 0
Reputation: 794
N.B.: sometimes tombstones can be created by null bindings in prepared statements - http://thelastpickle.com/blog/2016/09/15/Null-bindings-on-prepared-statements-and-undesired-tombstone-creation.html
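One way to avoid binding nulls is to leave null-valued columns out of the INSERT entirely. The following is a minimal sketch of that idea in plain Python; build_insert is a hypothetical helper (not part of any Cassandra driver), and it assumes the table and column names come from trusted code, since they are interpolated into the statement text.

```python
from typing import Any, List, Mapping, Tuple


def build_insert(table: str, row: Mapping[str, Any]) -> Tuple[str, List[Any]]:
    """Build an INSERT that names only the columns holding a real value.

    Hypothetical helper: columns whose value is None are dropped, so no
    null is ever bound and no tombstone is written for them.
    """
    present = {col: val for col, val in row.items() if val is not None}
    if not present:
        raise ValueError("row has no non-null columns to insert")
    columns = ", ".join(present)
    # "?" placeholders, as used by prepared statements
    placeholders = ", ".join("?" for _ in present)
    cql = f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"
    return cql, list(present.values())


# Example with the test.t1 table used elsewhere in this thread:
cql, params = build_insert("test.t1", {"key": "k1", "value": None})
print(cql)     # INSERT INTO test.t1 (key) VALUES (?)
print(params)  # ['k1']
```

The trade-off is that each distinct set of columns yields a different statement to prepare. Some drivers can instead leave a bound value unset on newer protocol versions; check your driver's documentation for that.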
Upvotes: 0
Reputation: 477
I am not sure there is a way to explain the tombstones if you are not doing any deletes.
I can offer two methods to try to analyze this; maybe they will help you understand what is happening and how.
There is a tool named sstable2json that takes an sstable and dumps it to JSON.
For example, for the following schema:
cqlsh> describe schema;
CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true;
CREATE TABLE test.t1 (
key text PRIMARY KEY,
value text
) WITH bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
Running sstable2json on an sstable file that contains a tombstone for a complete partition gives the following:
[
{"key": "key",
"metadata": {"deletionInfo": {"markedForDeleteAt":1475270192779047,"localDeletionTime":1475270192}},
"cells": []}
]
In this case the marker is for the partition whose key is "key".
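If the dump is large, scanning it by eye is tedious. Here is a minimal Python sketch that lists the partitions carrying a deletion marker. It assumes the JSON has exactly the shape shown above ("key", "metadata" with "deletionInfo", "cells"); partition_tombstones is a hypothetical helper name, and other kinds of tombstones (for example on individual cells) are not handled.

```python
import json
from typing import Any, Dict, List


def partition_tombstones(dump: str) -> List[Dict[str, Any]]:
    """Return one entry per partition that has a deletionInfo marker.

    Assumes sstable2json output shaped like the sample above.
    """
    found = []
    for row in json.loads(dump):
        info = row.get("metadata", {}).get("deletionInfo")
        if info:
            found.append({
                "key": row["key"],
                # write timestamp of the delete, in microseconds
                "markedForDeleteAt": info["markedForDeleteAt"],
                # server time of the delete, in seconds
                "localDeletionTime": info["localDeletionTime"],
            })
    return found


# The sample output from above, as a string:
sample = """
[
{"key": "key",
 "metadata": {"deletionInfo": {"markedForDeleteAt":1475270192779047,"localDeletionTime":1475270192}},
 "cells": []}
]
"""
for entry in partition_tombstones(sample):
    print(entry["key"], entry["localDeletionTime"])
```

Comparing localDeletionTime against your application logs may help to find which client issued the delete.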
Another method you can use (given that the tombstone count is increasing) is to capture the traffic with tcpdump and then analyze it with Wireshark. Benoit Canet from ScyllaDB contributed a CQL dissector to Wireshark, and it is now in the latest stable release 2.2.0 (https://www.wireshark.org/docs/relnotes/wireshark-2.2.0.html)
Please note that CQL deletes can show up in two message types, QUERY and PREPARED (the latter if the deletes are done using prepared statements).
If they are done via prepared statements you may need to drop the CQL connections to make sure you catch the specific packets that have the prepared statements.
Here is a sample from wireshark capturing the delete statement from above
Upvotes: 1