Reputation: 97
I wish to delete a large number of rows from a particular table.
I took the following steps: 1) Set gc_grace_seconds = 0 for the table. 2) Deleted a large number of rows (~1 million). 3) Ran ./nodetool compact keyspace_name table_name.
However, when I ran nodetool compact (step 3), nothing happened. It did not start a compaction. Due to the large number of tombstones, most of my requests now time out as well.
The table has the following settings:
AND bloom_filter_fp_chance = 0.001
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = ''
AND compaction = {'tombstone_threshold': '0.2', 'tombstone_compaction_interval': '86400', 'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 0
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
I wish to compact and get rid of the tombstones so that I can actually get rid of the unwanted data.
I have two nodes in my cluster with replication factor 2. Since I did the deletes, the difference in size between the two has increased; there is now a difference of about 700MB. I am using dsc-cassandra-2.1.10.
cfstats are shown below:
Keyspace: keyspace1
Read Count: 16316
Read Latency: 12.23892982348615 ms.
Write Count: 11078808
Write Latency: 0.6955001765532899 ms.
Pending Flushes: 0
Table: table1
SSTable count: 92
SSTables in each level: [1, 4, 38, 49, 0, 0, 0, 0, 0]
Space used (live): 38247164244
Space used (total): 38247164244
Space used by snapshots (total): 26692664189
Off heap memory used (total): 14695952
SSTable Compression Ratio: 0.32499125289530584
Number of keys (estimate): 2788
Memtable cell count: 16632
Memtable data size: 1839846
Memtable off heap memory used: 0
Memtable switch count: 93
Local read count: 16316
Local read latency: 12.239 ms
Local write count: 11078808
Local write latency: 0.696 ms
Pending flushes: 0
Bloom filter false positives: 331
Bloom filter false ratio: 0.00000
Bloom filter space used: 10960
Bloom filter off heap memory used: 10224
Index summary off heap memory used: 3672
Compression metadata off heap memory used: 14682056
Compacted partition minimum bytes: 216
Compacted partition maximum bytes: 3449259151
Compacted partition mean bytes: 25823653
Average live cells per slice (last five minutes): 405.3014160485502
Maximum live cells per slice (last five minutes): 5002.0
Average tombstones per slice (last five minutes): 0.0
Maximum tombstones per slice (last five minutes): 0.0
Upvotes: 0
Views: 2610
Reputation: 477
The compaction strategy dictates the behavior of nodetool compact, and there are subtle differences in the API between versions:
http://docs.datastax.com/en/archived/cassandra/3.x/cassandra/tools/toolsCompact.html vs https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsCompact.html
To remove the data and tombstones:
Executing a major compaction and switching between compaction strategies are both I/O-intensive operations; please take that into account.
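A minimal sketch of one way to combine the two operations mentioned above (a strategy switch plus a major compaction). It assumes that on 2.1 `nodetool compact` does not run a major compaction for a table using LeveledCompactionStrategy, so the table is temporarily switched to SizeTieredCompactionStrategy and switched back afterwards. The keyspace and table names are taken from the cfstats output in the question, the `run` and `remove_tombstones` helpers are hypothetical names of mine, and `cqlsh -e` is assumed to be available on your install. By default the script only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/usr/bin/env bash
# Sketch only: prints the commands by default; set DRY_RUN=0 to really run them.
set -euo pipefail

KEYSPACE="${KEYSPACE:-keyspace1}"   # name from the cfstats output in the question
TABLE="${TABLE:-table1}"            # name from the cfstats output in the question
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command, execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# Hypothetical helper: strategy switch -> major compaction -> switch back.
remove_tombstones() {
  local ks="$1" tbl="$2"
  # 1) Assumption: switch to size-tiered so a major compaction is possible.
  run cqlsh -e "ALTER TABLE ${ks}.${tbl} WITH compaction = {'class': 'SizeTieredCompactionStrategy'};"
  # 2) Major compaction of the table on the local node.
  run nodetool compact "$ks" "$tbl"
  # 3) Restore the leveled settings shown in the question.
  run cqlsh -e "ALTER TABLE ${ks}.${tbl} WITH compaction = {'class': 'LeveledCompactionStrategy', 'tombstone_threshold': '0.2', 'tombstone_compaction_interval': '86400'};"
}

remove_tombstones "$KEYSPACE" "$TABLE"
```

nodetool acts on a single node, so step 2 would need to be run on each node in the cluster, whereas the ALTER TABLE statements apply cluster-wide. Try it on a test cluster first and check the docs linked above for your exact version.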
Upvotes: 0