kostas.kougios

Reputation: 993

Cassandra, bad performance of time series table

I have a 3-node cluster (all on the same 16-core box, each node in its own virtual box via LXC and each on its own 3TB disk).

My table is this:

CREATE TABLE history (
 id text,
 idx bigint,
 data bigint,
 PRIMARY KEY (id, idx)
) WITH CLUSTERING ORDER BY (idx DESC)

id stores an identifier string, idx is a time in ms, and data holds my data. According to all the examples I found, this seems to be a correct schema for time series data.

My query is:

select idx,data from history where id=? limit 2

This returns the 2 most recent (based on idx) rows.

Since id is the partition key and idx the clustering key, the docs I found claim that this query is very performant in Cassandra. But my benchmarks say otherwise.

I've populated 400GB in total (split across those 3 nodes) and now I am running queries from a second box. Using 16 or 32 threads, I run the query above, but the performance is really low for 3 nodes running on 3 separate disks:

throughput: 61         avg time: 614,808 μs
throughput: 57         avg time: 519,651 μs
throughput: 52         avg time: 569,245 μs

So, ~55 queries per second, each query taking about half a second (sometimes they take 200ms).
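
As a quick sanity check on these numbers, Little's law (requests in flight = throughput × average latency) can be applied to the three samples above. This is only a sketch: the `samples` list copies the figures printed above, and the name `in_flight` is made up for illustration.

```python
# Little's law: average requests in flight = throughput * average latency.
# The (throughput, avg latency in microseconds) pairs are the three benchmark samples above.
samples = [(61, 614_808), (57, 519_651), (52, 569_245)]

def in_flight(throughput_per_s, avg_latency_us):
    """Average number of concurrent requests implied by the two measurements."""
    return throughput_per_s * avg_latency_us / 1_000_000

for qps, latency_us in samples:
    print(f"{qps} q/s at {latency_us / 1000:.0f} ms -> ~{in_flight(qps, latency_us):.1f} requests in flight")
```

The results come out at roughly 30 to 38 requests in flight, which is about what a 32-thread client would produce if every thread were waiting on a response. If that reading is right, throughput is capped by per-query latency on the server side rather than by the client.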

I find this really low.

Can someone please tell me if my schema is correct and if not suggest a schema? If my schema is correct, how can I find what is going wrong?

Disk I/O on the 16-core box:

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda               0.00         0.00         0.00          0          0
sdb             135.00         6.76         0.00          6          0
sdc             149.00         6.99         0.00          6          0
sdd             124.00         7.21         0.00          7          0

The Cassandra processes don't use more than 1 CPU core each.

EDIT: With tracing on I get a lot of lines like the following when I run a simple query for 1 id:

                                            Key cache hit for sstable 33259 | 20:16:26,699 | 127.0.0.1 |           5830
                                 Seeking to partition beginning in data file | 20:16:26,699 | 127.0.0.1 |           5833
                                  Bloom filter allows skipping sstable 33256 | 20:16:26,699 | 127.0.0.1 |           5923
                                  Bloom filter allows skipping sstable 33255 | 20:16:26,699 | 127.0.0.1 |           5932
                                  Bloom filter allows skipping sstable 33252 | 20:16:26,699 | 127.0.0.1 |           5938
                                             Key cache hit for sstable 33247 | 20:16:26,699 | 127.0.0.1 |           5948
                                 Seeking to partition beginning in data file | 20:16:26,699 | 127.0.0.1 |           5951
                                  Bloom filter allows skipping sstable 33246 | 20:16:26,699 | 127.0.0.1 |           6072
                                  Bloom filter allows skipping sstable 33243 | 20:16:26,699 | 127.0.0.1 |           6081
                                             Key cache hit for sstable 33242 | 20:16:26,699 | 127.0.0.1 |           6092
                                 Seeking to partition beginning in data file | 20:16:26,699 | 127.0.0.1 |           6095
                                  Bloom filter allows skipping sstable 33240 | 20:16:26,699 | 127.0.0.1 |           6187
                                             Key cache hit for sstable 33237 | 20:16:26,699 | 127.0.0.1 |           6198
                                 Seeking to partition beginning in data file | 20:16:26,699 | 127.0.0.1 |           6201
                                             Key cache hit for sstable 33235 | 20:16:26,699 | 127.0.0.1 |           6297
                                 Seeking to partition beginning in data file | 20:16:26,699 | 127.0.0.1 |           6301
                                  Bloom filter allows skipping sstable 33234 | 20:16:26,699 | 127.0.0.1 |           6393
                                             Key cache hit for sstable 33229 | 20:16:26,699 | 127.0.0.1 |           6404
                                 Seeking to partition beginning in data file | 20:16:26,699 | 127.0.0.1 |           6408
                                  Bloom filter allows skipping sstable 33228 | 20:16:26,699 | 127.0.0.1 |           6496
                                             Key cache hit for sstable 33227 | 20:16:26,699 | 127.0.0.1 |           6508
                                 Seeking to partition beginning in data file | 20:16:26,699 | 127.0.0.1 |           6511
                                             Key cache hit for sstable 33226 | 20:16:26,699 | 127.0.0.1 |           6601
                                 Seeking to partition beginning in data file | 20:16:26,699 | 127.0.0.1 |           6605
                                             Key cache hit for sstable 33225 | 20:16:26,700 | 127.0.0.1 |           6692
                                 Seeking to partition beginning in data file | 20:16:26,700 | 127.0.0.1 |           6696
                                             Key cache hit for sstable 33223 | 20:16:26,700 | 127.0.0.1 |           6785
                                 Seeking to partition beginning in data file | 20:16:26,700 | 127.0.0.1 |           6789
                                             Key cache hit for sstable 33221 | 20:16:26,700 | 127.0.0.1 |           6876
                                 Seeking to partition beginning in data file | 20:16:26,700 | 127.0.0.1 |           6880
                                  Bloom filter allows skipping sstable 33219 | 20:16:26,700 | 127.0.0.1 |           6967
                                             Key cache hit for sstable 33377 | 20:16:26,700 | 127.0.0.1 |           6978
                                 Seeking to partition beginning in data file | 20:16:26,700 | 127.0.0.1 |           6981
                                             Key cache hit for sstable 33208 | 20:16:26,700 | 127.0.0.1 |           7071
                                 Seeking to partition beginning in data file | 20:16:26,700 | 127.0.0.1 |           7075
                                             Key cache hit for sstable 33205 | 20:16:26,700 | 127.0.0.1 |           7161
                                 Seeking to partition beginning in data file | 20:16:26,700 | 127.0.0.1 |           7166
                                  Bloom filter allows skipping sstable 33201 | 20:16:26,700 | 127.0.0.1 |           7251
                                  Bloom filter allows skipping sstable 33200 | 20:16:26,700 | 127.0.0.1 |           7260
                                             Key cache hit for sstable 33195 | 20:16:26,700 | 127.0.0.1 |           7276
                                 Seeking to partition beginning in data file | 20:16:26,700 | 127.0.0.1 |           7279
                                  Bloom filter allows skipping sstable 33191 | 20:16:26,700 | 127.0.0.1 |           7363
                                             Key cache hit for sstable 33190 | 20:16:26,700 | 127.0.0.1 |           7374
                                 Seeking to partition beginning in data file | 20:16:26,700 | 127.0.0.1 |           7377
                                  Bloom filter allows skipping sstable 33189 | 20:16:26,700 | 127.0.0.1 |           7463
                                             Key cache hit for sstable 33186 | 20:16:26,700 | 127.0.0.1 |           7474
                                 Seeking to partition beginning in data file | 20:16:26,700 | 127.0.0.1 |           7477
                                             Key cache hit for sstable 33183 | 20:16:26,700 | 127.0.0.1 |           7563
                                 Seeking to partition beginning in data file | 20:16:26,700 | 127.0.0.1 |           7567
                                  Bloom filter allows skipping sstable 33182 | 20:16:26,701 | 127.0.0.1 |           7663
                                  Bloom filter allows skipping sstable 33180 | 20:16:26,701 | 127.0.0.1 |           7672
                                  Bloom filter allows skipping sstable 33178 | 20:16:26,701 | 127.0.0.1 |           7679
                                  Bloom filter allows skipping sstable 33177 | 20:16:26,701 | 127.0.0.1 |           7686

Maybe most important is the end of the trace:

                                Merging data from memtables and 277 sstables | 20:21:29,186 | 127.0.0.1 |         607001
                                          Read 3 live and 0 tombstoned cells | 20:21:29,186 | 127.0.0.1 |         607205
                                                            Request complete | 20:21:29,186 | 127.0.0.1 |         607714
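
To see how many SSTables a single read touches, the trace output can be tallied by event type. The following is a minimal sketch that assumes the pipe-separated layout shown above; `tally_trace` and the `sample` list are illustrative names and not part of any Cassandra tooling.

```python
from collections import Counter

def tally_trace(lines):
    """Count trace events by activity text, ignoring trailing sstable numbers."""
    counts = Counter()
    for line in lines:
        if "|" not in line:
            continue  # skip blank lines or anything that is not a trace row
        activity = line.split("|")[0].strip()
        words = activity.split()
        # Drop a trailing sstable id so rows for different sstables group together.
        if words and words[-1].isdigit():
            words = words[:-1]
        counts[" ".join(words)] += 1
    return counts

# Four rows copied from the trace above (whitespace trimmed).
sample = [
    "Key cache hit for sstable 33259 | 20:16:26,699 | 127.0.0.1 | 5830",
    "Seeking to partition beginning in data file | 20:16:26,699 | 127.0.0.1 | 5833",
    "Bloom filter allows skipping sstable 33256 | 20:16:26,699 | 127.0.0.1 | 5923",
    "Bloom filter allows skipping sstable 33255 | 20:16:26,699 | 127.0.0.1 | 5932",
]

for activity, n in tally_trace(sample).most_common():
    print(n, activity)
```

Each "Seeking to partition beginning in data file" row marks an SSTable whose data file is consulted for this one query (a potential disk seek), so that count is the one to watch when comparing runs.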

Upvotes: 1

Views: 464

Answers (1)

Tupshin Harper

Reputation: 1297

Do look at tracing to confirm, but if sdb, sdc, and sdd are spinning disks, you are seeing the correct order of magnitude of tps, and are very likely bound by random disk I/O on the read side.

If that is the case, then you only have two options (with any system, not specific to Cassandra):

  1. Switch to SSDs. My personal testing has demonstrated up to 3 orders of magnitude increased random read performance when the workload was entirely bound by the tps of the disks.
  2. Ensure that a very large percentage of your reads are cached. If you are doing random reads across 400GB of data, that is probably not going to be feasible.

Cassandra can do roughly 3k-5k operations (read or write) per second per CPU core, but only if the disk subsystem isn't the limiting factor.
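
As a rough back-of-the-envelope check of the disk-bound hypothesis, the iostat and benchmark figures from the question can be combined. This sketch assumes the iostat sample and the throughput samples describe the same steady state, which the question does not state; the variable names are made up for illustration.

```python
# tps per data disk (sdb, sdc, sdd) from the iostat output in the question.
disk_tps = {"sdb": 135.0, "sdc": 149.0, "sdd": 124.0}
# Queries per second from the three benchmark samples in the question.
throughput_samples = [61, 57, 52]

total_tps = sum(disk_tps.values())
avg_qps = sum(throughput_samples) / len(throughput_samples)
ios_per_query = total_tps / avg_qps

print(f"total disk tps: {total_tps:.0f}")
print(f"average queries/s: {avg_qps:.1f}")
print(f"disk I/Os per query: {ios_per_query:.1f}")
```

Under that assumption each query costs about 7 disk I/Os. If the disks are already near what spinning disks can deliver, throughput cannot rise much unless the number of I/Os per query drops (more cache hits) or the disks get faster.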

Upvotes: 2
