chnet

Reputation: 2033

Cassandra DateTieredCompactionStrategy SSTables

We are running Cassandra 2.2.3.

We have a column family using DateTieredCompactionStrategy, defined as follows:

CREATE TABLE test (
     num_id text,
     position_time timestamp,
     acc int,
     coordinate text,
     device_no text,
     PRIMARY KEY (num_id, position_time, coordinate)
 ) WITH CLUSTERING ORDER BY (position_time DESC, coordinate ASC)
     AND bloom_filter_fp_chance = 0.01
     AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
     AND comment = 'table for gps points from car gps source.'
     AND compaction = {'timestamp_resolution': 'MILLISECONDS', 'max_sstable_age_days': '8', 'base_time_seconds': '3600', 'class': 'org.apache.cassandra.db.compaction.DateTieredCompactionStrategy'}
     AND compression = {'chunk_length_kb': '64', 'crc_check_chance': '1.0', 'sstable_compression': 'org.apache.cassandra.io.compress.SnappyCompressor'}
     AND dclocal_read_repair_chance = 0.1
     AND default_time_to_live = 0
     AND gc_grace_seconds = 86400
     AND max_index_interval = 2048
     AND memtable_flush_period_in_ms = 0
     AND min_index_interval = 128
     AND read_repair_chance = 0.0
     AND speculative_retry = '99.0PERCENTILE';

We do have some traffic; data keeps being inserted into this table.

Cassandra generates many SSTables, around 2,000 in total. For example:

-rw-r--r-- 1 cassandra cassandra   86M Jan 20 02:59 la-11110-big-Data.db
-rw-r--r-- 1 cassandra cassandra  111M Jan 20 03:11 la-11124-big-Data.db
-rw-r--r-- 1 cassandra cassandra  176M Jan 20 03:12 la-11125-big-Data.db
-rw-r--r-- 1 cassandra cassandra  104M Jan 20 03:14 la-11130-big-Data.db
-rw-r--r-- 1 cassandra cassandra  102M Jan 20 03:26 la-11144-big-Data.db
-rw-r--r-- 1 cassandra cassandra  172M Jan 20 03:26 la-11145-big-Data.db
-rw-r--r-- 1 cassandra cassandra  107M Jan 20 03:30 la-11149-big-Data.db
-rw-r--r-- 1 cassandra cassandra   96M Jan 20 03:41 la-11163-big-Data.db
-rw-r--r-- 1 cassandra cassandra  176M Jan 20 03:41 la-11164-big-Data.db
-rw-r--r-- 1 cassandra cassandra   97M Jan 20 03:45 la-11169-big-Data.db
-rw-r--r-- 1 cassandra cassandra   82M Jan 20 03:57 la-11183-big-Data.db
-rw-r--r-- 1 cassandra cassandra  194M Jan 20 03:58 la-11184-big-Data.db
-rw-r--r-- 1 cassandra cassandra   28M Jan 20 03:59 la-11187-big-Data.db
-rw-r--r-- 1 cassandra cassandra   90M Jan 20 04:00 la-11188-big-Data.db

My question is: is it normal to have so many SSTables (2,000)?

The other thing is that we are experiencing a ReadTimeoutException on select queries. The select query filters on the partition key num_id and a range on the clustering key position_time. The read timeout is set to 10 seconds.
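
For reference, the query is roughly of this shape (a sketch only: the column list and the time range are illustrative; the partition-key equality plus clustering-key range is what is described above):

SELECT coordinate, acc, device_no
FROM test
WHERE num_id = '12345'                        -- partition key (equality)
  AND position_time >= '2016-01-19 00:00:00'  -- clustering key (range)
  AND position_time <  '2016-01-20 00:00:00';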

So, the other question is: is the read timeout exception caused by the many SSTables or by wide rows? And how can we avoid this exception?

Upvotes: 1

Views: 346

Answers (2)

Marcus Eriksson

Reputation: 118

The problem is 'timestamp_resolution': 'MILLISECONDS'. I've filed https://issues.apache.org/jira/browse/CASSANDRA-11041 to improve the documentation about this parameter.
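
Assuming the application relies on the driver's default microsecond write timestamps (an assumption; it depends on how the inserts set their timestamps), the option could be switched back to the default resolution, roughly like this:

ALTER TABLE test
WITH compaction = {
    'class': 'org.apache.cassandra.db.compaction.DateTieredCompactionStrategy',
    'timestamp_resolution': 'MICROSECONDS',
    'max_sstable_age_days': '8',
    'base_time_seconds': '3600'
};

If the application really does supply millisecond timestamps explicitly, then 'MILLISECONDS' is the correct value and the problem lies elsewhere; the point is that timestamp_resolution has to match the resolution of the timestamps that are actually written.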

Upvotes: 2

doanduyhai

Reputation: 8812

My question is: is it normal to have so many SSTables (2,000)?

No, it's not normal. I think that in your case, the compaction is not fast enough to keep up with the ingestion rate. What kind of hard drive do you have for the Cassandra server? Spinning disk? SSD? Shared storage?
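
One quick way to see whether compaction is falling behind is to check the number of pending compaction tasks and the per-table SSTable count (replace <keyspace> with your keyspace name):

# running compactions and number of pending compaction tasks
nodetool compactionstats

# per-table statistics, including the SSTable count (2.2.x syntax)
nodetool cfstats <keyspace>.test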

So, the other question is: is the read timeout exception caused by the many SSTables or by wide rows?

It can be both, but in your case, I'm pretty sure that it's related to the huge number of SSTables.

How can we avoid this exception?

Check that your disk I/O can keep up. Use the dstat and iostat Linux tools to monitor the I/O.
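
For example (both tools are standard Linux utilities; the intervals below are arbitrary):

# extended per-device statistics every 5 seconds; watch the %util and await columns
iostat -x 5

# disk read/write throughput, refreshed every second
dstat -d 1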

Upvotes: 2
