Reputation: 101
Trying to find out why a Cassandra read is taking so long, I used tracing and limited the number of rows. Strangely, when I query 600 rows I get results in ~50 milliseconds, but 610 rows takes nearly 1 second!
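(For reference, "tracing" here just means enabling it in the shell before running the queries:
cqlsh> TRACING ON;
The session IDs and timings below are from those traced runs.)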
cqlsh> select containerdefinitionid from containerdefinition limit 600;
... lots of output ...
Tracing session: 6b506cd0-83bc-11e3-96e8-e182571757d7
activity | timestamp | source | source_elapsed
-------------------------------------------------------------------------------------------------+--------------+---------------+----------------
execute_cql3_query | 15:25:02,878 | 130.4.147.116 | 0
Parsing statement | 15:25:02,878 | 130.4.147.116 | 39
Preparing statement | 15:25:02,878 | 130.4.147.116 | 101
Determining replicas to query | 15:25:02,878 | 130.4.147.116 | 152
Executing seq scan across 1 sstables for [min(-9223372036854775808), min(-9223372036854775808)] | 15:25:02,879 | 130.4.147.116 | 1021
Scanned 755 rows and matched 755 | 15:25:02,933 | 130.4.147.116 | 55169
Request complete | 15:25:02,934 | 130.4.147.116 | 56300
cqlsh> select containerdefinitionid from containerdefinition limit 610;
... just about the same output and trace info, except...
Scanned 766 rows and matched 766 | 15:25:58,908 | 130.4.147.116 | 739141
There seems to be nothing unusual about the data in those particular rows:
- values are similar to those before and after
- using the COPY command I can export the whole table and import it on a different cluster, and performance is fine
- these rows are just the first example; there seem to be other places where query time jumps as well

The whole table is only ~3000 rows but takes ~15 sec to list all primary keys.
There does, however, seem to be something unusual about the data STORAGE:
- a snapshot copied to another cluster and imported gives the same slow results at the same limits
- COPYing the data to CSV and loading it into another cluster does not, and performance is great (commands below)
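For completeness, the COPY round-trip mentioned above was essentially the following (the CSV file name is just an example):
cqlsh> COPY containerdefinition TO 'containerdefinition.csv';
cqlsh> COPY containerdefinition FROM 'containerdefinition.csv';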
Have tried compaction, repair, reindex, cleanup and refresh. No effect.
I realize I could "fix" by copying data out and in, but I'm trying to figure out what is going on here to avoid it happening in production on a table too big to fix with COPY.
The table has 17 columns, 3 secondary indices, a TEXT primary key, two LIST columns, and two TIMESTAMP columns; the rest are TEXT (rough sketch of the shape below). I can reproduce the issue with both SimpleStrategy and DC-aware replication, and with 4 copies of the data on 4 servers, 2 copies on 2 servers, and 1 copy on 2 servers (so it doesn't matter whether the query is served locally or involves multiple servers). This is Cassandra 1.2 with cqlsh.
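For what it's worth, the table is shaped roughly like this (column names here are invented; only the structure matches the real table):
CREATE TABLE containerdefinition (
    containerdefinitionid text PRIMARY KEY,
    created timestamp,
    modified timestamp,
    labels list<text>,
    notes list<text>,
    name text,
    owner text,
    description text
    -- ...plus the remaining TEXT columns
);
CREATE INDEX ON containerdefinition (name);
CREATE INDEX ON containerdefinition (owner);
CREATE INDEX ON containerdefinition (description);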
Any ideas? Suggestions?
Upvotes: 2
Views: 260
Reputation: 4600
Any chance you have the row cache enabled for this table? The row cache keeps recently accessed rows in memory, so reads served from it can be much faster than reads that have to go to disk.
The key cache, which holds partition keys and their offsets on disk, can also provide a significant speedup.
Can you let me know what settings you are currently using for the row cache and key cache?
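For example, on Cassandra 1.2 you can see the table's current caching mode with DESCRIBE, and the global sizes are key_cache_size_in_mb and row_cache_size_in_mb in cassandra.yaml. Something along these lines (the ALTER is just an example of switching to key caching only):
cqlsh> DESCRIBE TABLE containerdefinition;
cqlsh> ALTER TABLE containerdefinition WITH caching = 'keys_only';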
Upvotes: 0