Reputation: 4017
I have a table that used to have lots of data but this data died due to ttl long time ago.
But when I run queries such as SELECT * FROM my_table LIMIT 10
or SELECT count(*) FROM my_table
runs out of few-second timeout.
When I read the data using spark-cassandra-connector (which queries the table by token ranges internally) it takes few minutes to actually get 0 entries.
I suspected that something is wrong with compaction and tombstones are not deleted but table data directory doesn't look huge:
/var/lib/scylla$ sudo ls -la
/data/scylla/data/my_space/my_table-75f8388035c211e9bc85000000000011
total 2604 drwx------ 4 scylla scylla 8192 Nov 27 15:55 .
drwx------ 215 scylla scylla 16384 Nov 15 19:00 ..
-rw-r--r-- 1 scylla scylla 538 Nov 25 22:58 mc-1552402-big-CompressionInfo.db
-rw-r--r-- 1 scylla scylla 162869 Nov 25 22:58 mc-1552402-big-Data.db
-rw-r--r-- 1 scylla scylla 10 Nov 25 22:58 mc-1552402-big-Digest.crc32
-rw-r--r-- 1 scylla scylla 656 Nov 25 22:58 mc-1552402-big-Filter.db
-rw-r--r-- 1 scylla scylla 72244 Nov 25 22:58 mc-1552402-big-Index.db
-rw-r--r-- 1 scylla scylla 54 Nov 25 22:58 mc-1552402-big-Scylla.db
-rw-r--r-- 1 scylla scylla 6234 Nov 25 22:58 mc-1552402-big-Statistics.db
-rw-r--r-- 1 scylla scylla 176 Nov 25 22:58 mc-1552402-big-Summary.db
-rw-r--r-- 1 scylla scylla 102 Nov 25 22:58 mc-1552402-big-TOC.txt
-rw-r--r-- 1 scylla scylla 474 Nov 25 22:38 mc-1578193-big-CompressionInfo.db
-rw-r--r-- 1 scylla scylla 135730 Nov 25 22:38 mc-1578193-big-Data.db
-rw-r--r-- 1 scylla scylla 10 Nov 25 22:38 mc-1578193-big-Digest.crc32
-rw-r--r-- 1 scylla scylla 496 Nov 25 22:38 mc-1578193-big-Filter.db
-rw-r--r-- 1 scylla scylla 67215 Nov 25 22:38 mc-1578193-big-Index.db
-rw-r--r-- 1 scylla scylla 54 Nov 25 22:38 mc-1578193-big-Scylla.db
-rw-r--r-- 1 scylla scylla 6234 Nov 25 22:38 mc-1578193-big-Statistics.db
-rw-r--r-- 1 scylla scylla 148 Nov 25 22:38 mc-1578193-big-Summary.db
-rw-r--r-- 1 scylla scylla 102 Nov 25 22:38 mc-1578193-big-TOC.txt
-rw-r--r-- 1 scylla scylla 202 Nov 25 23:12 mc-1582107-big-CompressionInfo.db
-rw-r--r-- 1 scylla scylla 49607 Nov 25 23:12 mc-1582107-big-Data.db
-rw-r--r-- 1 scylla scylla 10 Nov 25 23:12 mc-1582107-big-Digest.crc32
-rw-r--r-- 1 scylla scylla 176 Nov 25 23:12 mc-1582107-big-Filter.db
-rw-r--r-- 1 scylla scylla 24733 Nov 25 23:12 mc-1582107-big-Index.db
-rw-r--r-- 1 scylla scylla 54 Nov 25 23:12 mc-1582107-big-Scylla.db
-rw-r--r-- 1 scylla scylla 6234 Nov 25 23:12 mc-1582107-big-Statistics.db
-rw-r--r-- 1 scylla scylla 92 Nov 25 23:12 mc-1582107-big-Summary.db
-rw-r--r-- 1 scylla scylla 102 Nov 25 23:12 mc-1582107-big-TOC.txt
-rw-r--r-- 1 scylla scylla 1826 Nov 25 22:39 mc-1601869-big-CompressionInfo.db
-rw-r--r-- 1 scylla scylla 639833 Nov 25 22:39 mc-1601869-big-Data.db
-rw-r--r-- 1 scylla scylla 10 Nov 25 22:39 mc-1601869-big-Digest.crc32
-rw-r--r-- 1 scylla scylla 2096 Nov 25 22:39 mc-1601869-big-Filter.db
-rw-r--r-- 1 scylla scylla 251799 Nov 25 22:39 mc-1601869-big-Index.db
-rw-r--r-- 1 scylla scylla 98 Nov 25 22:39 mc-1601869-big-Scylla.db
-rw-r--r-- 1 scylla scylla 6234 Nov 25 22:39 mc-1601869-big-Statistics.db
-rw-r--r-- 1 scylla scylla 428 Nov 25 22:39 mc-1601869-big-Summary.db
-rw-r--r-- 1 scylla scylla 102 Nov 25 22:39 mc-1601869-big-TOC.txt
-rw-r--r-- 1 scylla scylla 514 Nov 25 23:40 mc-1664251-big-CompressionInfo.db
-rw-r--r-- 1 scylla scylla 146270 Nov 25 23:40 mc-1664251-big-Data.db
-rw-r--r-- 1 scylla scylla 10 Nov 25 23:40 mc-1664251-big-Digest.crc32
-rw-r--r-- 1 scylla scylla 496 Nov 25 23:40 mc-1664251-big-Filter.db
-rw-r--r-- 1 scylla scylla 69800 Nov 25 23:40 mc-1664251-big-Index.db
-rw-r--r-- 1 scylla scylla 54 Nov 25 23:40 mc-1664251-big-Scylla.db
-rw-r--r-- 1 scylla scylla 6234 Nov 25 23:40 mc-1664251-big-Statistics.db
-rw-r--r-- 1 scylla scylla 148 Nov 25 23:40 mc-1664251-big-Summary.db
-rw-r--r-- 1 scylla scylla 102 Nov 25 23:40 mc-1664251-big-TOC.txt
-rw-r--r-- 1 scylla scylla 522 Nov 25 23:39 mc-1679659-big-CompressionInfo.db
-rw-r--r-- 1 scylla scylla 147350 Nov 25 23:39 mc-1679659-big-Data.db
-rw-r--r-- 1 scylla scylla 9 Nov 25 23:39 mc-1679659-big-Digest.crc32
-rw-r--r-- 1 scylla scylla 496 Nov 25 23:39 mc-1679659-big-Filter.db
-rw-r--r-- 1 scylla scylla 70504 Nov 25 23:39 mc-1679659-big-Index.db
-rw-r--r-- 1 scylla scylla 54 Nov 25 23:39 mc-1679659-big-Scylla.db
-rw-r--r-- 1 scylla scylla 6234 Nov 25 23:39 mc-1679659-big-Statistics.db
-rw-r--r-- 1 scylla scylla 148 Nov 25 23:39 mc-1679659-big-Summary.db
-rw-r--r-- 1 scylla scylla 102 Nov 25 23:39 mc-1679659-big-TOC.txt
-rw-r--r-- 1 scylla scylla 562 Nov 25 23:05 mc-1742082-big-CompressionInfo.db
-rw-r--r-- 1 scylla scylla 181106 Nov 25 23:05 mc-1742082-big-Data.db
-rw-r--r-- 1 scylla scylla 8 Nov 25 23:05 mc-1742082-big-Digest.crc32
-rw-r--r-- 1 scylla scylla 656 Nov 25 23:05 mc-1742082-big-Filter.db
-rw-r--r-- 1 scylla scylla 69320 Nov 25 23:05 mc-1742082-big-Index.db
-rw-r--r-- 1 scylla scylla 54 Nov 25 23:05 mc-1742082-big-Scylla.db
-rw-r--r-- 1 scylla scylla 6234 Nov 25 23:05 mc-1742082-big-Statistics.db
-rw-r--r-- 1 scylla scylla 176 Nov 25 23:05 mc-1742082-big-Summary.db
-rw-r--r-- 1 scylla scylla 102 Nov 25 23:05 mc-1742082-big-TOC.txt
-rw-r--r-- 1 scylla scylla 554 Nov 25 23:24 mc-1774537-big-CompressionInfo.db
-rw-r--r-- 1 scylla scylla 176335 Nov 25 23:24 mc-1774537-big-Data.db
-rw-r--r-- 1 scylla scylla 9 Nov 25 23:24 mc-1774537-big-Digest.crc32
-rw-r--r-- 1 scylla scylla 656 Nov 25 23:24 mc-1774537-big-Filter.db
-rw-r--r-- 1 scylla scylla 67974 Nov 25 23:24 mc-1774537-big-Index.db
-rw-r--r-- 1 scylla scylla 54 Nov 25 23:24 mc-1774537-big-Scylla.db
-rw-r--r-- 1 scylla scylla 6234 Nov 25 23:24 mc-1774537-big-Statistics.db
-rw-r--r-- 1 scylla scylla 176 Nov 25 23:24 mc-1774537-big-Summary.db
-rw-r--r-- 1 scylla scylla 102 Nov 25 23:24 mc-1774537-big-TOC.txt drwx------ 2 scylla scylla 6 Feb 21 2019 staging drwx------ 2 scylla scylla 6 Feb 21 2019 upload
Modification times are quite fresh so probably some compaction actually happened.
My table config:
CREATE TABLE my_table (
f1 uuid,
f2 tinyint,
c boolean,
PRIMARY KEY (f1, f2)
) WITH CLUSTERING ORDER BY (f1 ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
AND comment = ''
AND compaction = {'class': 'SizeTieredCompactionStrategy'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
What could be the problem?
Upvotes: 1
Views: 946
Reputation: 620
Tombstones are kept alive for gc_grace_period
which is 10 days by default. This is to prevent data resurrection, that is to assure each node that has data falling under the tombstones receive that tombstone (via repair).
Your query is probably reading through thousands of tombstones, not finding anything to return.
Upvotes: 5
Reputation: 556
nodetool cfstats
https://docs.scylladb.com/operating-scylla/nodetool-commands/cfstats/ to check known table size and avg of tombstones to what you see on disk.You will be able to move from there considering the results.
Upvotes: 5