Reputation: 186
I am working with Azure Managed Cassandra and am seeing a read performance issue when reading data from a single partition. The keyspace and table details are below.
The keyspace was created using SimpleStrategy in one data center with an RF of 3. The table description is:
CREATE TABLE ks1.table1 (
item text,
market text,
location int,
brand text,
channel text,
qty1 int,
locationtype text,
nonsellableqty int,
qty2 int,
qty3 int,
refitem text,
reflocation int,
rtvqty int,
soh int,
tsfexpectedqty int,
tsfreservedqty int,
PRIMARY KEY (item, market, location)
) WITH CLUSTERING ORDER BY (market ASC, location ASC)
AND bloom_filter_fp_chance = 0.1
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'SizeTieredCompactionStrategy'} ;
The query is: select * from table1 where item='1001';
With this single-partition query, the response time is around 300 to 400 ms.
I enabled tracing to check the system traces, and this is what I observed:
session_id | event_id | activity | source | source_elapsed | source_port | thread
--------------------------------------+--------------------------------------+---------------------------------------------------------------------------+---------------+----------------+-------------+------------------------------------------------
eb20c950-7756-11ec-bf3a-35d288fe166a | eb20f060-7756-11ec-bf3a-35d288fe166a | reading digest from /host3 | host1 | 209 | null | Native-Transport-Requests-1
eb20c950-7756-11ec-bf3a-35d288fe166a | eb20f061-7756-11ec-bf3a-35d288fe166a | Executing single-partition query on roles | host1 | 241 | null | ReadStage-3
eb20c950-7756-11ec-bf3a-35d288fe166a | eb20f062-7756-11ec-bf3a-35d288fe166a | Acquiring sstable references | host1 | 269 | null | ReadStage-3
eb20c950-7756-11ec-bf3a-35d288fe166a | eb20f063-7756-11ec-bf3a-35d288fe166a | Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones | host1 | 294 | null | ReadStage-3
eb20c950-7756-11ec-bf3a-35d288fe166a | eb20f064-7756-11ec-bf3a-35d288fe166a | Key cache hit for sstable 1 | host1 | 329 | null | ReadStage-3
eb20c950-7756-11ec-bf3a-35d288fe166a | eb20f065-7756-11ec-bf3a-35d288fe166a | Sending READ message to /host3 | host1 | 330 | null | MessagingService-Outgoing-/host3-Small
eb20c950-7756-11ec-bf3a-35d288fe166a | eb20f066-7756-11ec-bf3a-35d288fe166a | Key cache hit for sstable 2 | host1 | 373 | null | ReadStage-3
eb20c950-7756-11ec-bf3a-35d288fe166a | eb20f067-7756-11ec-bf3a-35d288fe166a | Merged data from memtables and 2 sstables | host1 | 425 | null | ReadStage-3
eb20c950-7756-11ec-bf3a-35d288fe166a | eb20f068-7756-11ec-bf3a-35d288fe166a | Read 1 live rows and 0 tombstone cells | host1 | 446 | null | ReadStage-3
eb20c950-7756-11ec-bf3a-35d288fe166a | eb211770-7756-11ec-a029-857a112314fb | READ message received from /host1 | host3 | 2 | null | MessagingService-Incoming-/host1
eb20c950-7756-11ec-bf3a-35d288fe166a | eb211770-7756-11ec-bf3a-35d288fe166a | REQUEST_RESPONSE message received from /host3 | host1 | 1247 | null | MessagingService-Incoming-/host3
eb20c950-7756-11ec-bf3a-35d288fe166a | eb211771-7756-11ec-a029-857a112314fb | Executing single-partition query on roles | host2 | 71 | null | ReadStage-3
eb20c950-7756-11ec-bf3a-35d288fe166a | eb211771-7756-11ec-bf3a-35d288fe166a | Processing response from /host2 | host1 | 1316 | null | RequestResponseStage-3
eb20c950-7756-11ec-bf3a-35d288fe166a | eb211772-7756-11ec-a029-857a112314fb | Acquiring sstable references | host2 | 96 | null | ReadStage-3
eb20c950-7756-11ec-bf3a-35d288fe166a | eb211772-7756-11ec-bf3a-35d288fe166a | reading digest from /host4 | host1 | 1426 | null | Native-Transport-Requests-1
eb20c950-7756-11ec-bf3a-35d288fe166a | eb211773-7756-11ec-a029-857a112314fb | Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones | host2 | 117 | null | ReadStage-3
eb20c950-7756-11ec-bf3a-35d288fe166a | eb211773-7756-11ec-bf3a-35d288fe166a | Executing single-partition query on table1 | host1 | 1441 | null | ReadStage-3
eb20c950-7756-11ec-bf3a-35d288fe166a | eb211774-7756-11ec-a029-857a112314fb | Key cache hit for sstable 1 | host2 | 144 | null | ReadStage-3
eb20c950-7756-11ec-bf3a-35d288fe166a | eb211774-7756-11ec-bf3a-35d288fe166a | Acquiring sstable references | host1 | 1449 | null | ReadStage-3
eb20c950-7756-11ec-bf3a-35d288fe166a | eb211775-7756-11ec-a029-857a112314fb | Key cache hit for sstable 2 | host2 | 172 | null | ReadStage-3
eb20c950-7756-11ec-bf3a-35d288fe166a | eb211775-7756-11ec-bf3a-35d288fe166a | speculating read retry on /host2 | host1 | 1453 | null | Native-Transport-Requests-1
eb20c950-7756-11ec-bf3a-35d288fe166a | eb211776-7756-11ec-a029-857a112314fb | Merged data from memtables and 2 sstables | host2 | 225 | null | ReadStage-3
eb20c950-7756-11ec-bf3a-35d288fe166a | eb211776-7756-11ec-bf3a-35d288fe166a | Key cache hit for sstable 2 | host1 | 1467 | null | ReadStage-3
eb20c950-7756-11ec-bf3a-35d288fe166a | eb211777-7756-11ec-a029-857a112314fb | Read 1 live rows and 0 tombstone cells | host2 | 244 | null | ReadStage-3
eb20c950-7756-11ec-bf3a-35d288fe166a | eb211777-7756-11ec-bf3a-35d288fe166a | Key cache hit for sstable 1 | host1 | 1481 | null | ReadStage-3
eb20c950-7756-11ec-bf3a-35d288fe166a | eb211778-7756-11ec-a029-857a112314fb | Enqueuing response to /host1 | host2 | 250 | null | ReadStage-3
eb20c950-7756-11ec-bf3a-35d288fe166a | eb211778-7756-11ec-bf3a-35d288fe166a | Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones | host1 | 1490 | null | ReadStage-3
eb20c950-7756-11ec-bf3a-35d288fe166a | eb211779-7756-11ec-a029-857a112314fb | Sending REQUEST_RESPONSE message to /host1 | host2 | 375 | null | MessagingService-Outgoing-/host1-Small
eb20c950-7756-11ec-bf3a-35d288fe166a | eb211779-7756-11ec-bf3a-35d288fe166a | Sending READ message to /host2 | host1 | 1495 | null | MessagingService-Outgoing-/host2-Small
eb20c950-7756-11ec-bf3a-35d288fe166a | eb21177a-7756-11ec-bf3a-35d288fe166a | Sending READ message to /host4 | host1 | 1501 | null | MessagingService-Outgoing-/host4-Small
eb20c950-7756-11ec-bf3a-35d288fe166a | eb21177b-7756-11ec-bf3a-35d288fe166a | Merged data from memtables and 2 sstables | host1 | 1603 | null | ReadStage-3
eb20c950-7756-11ec-bf3a-35d288fe166a | eb21177c-7756-11ec-bf3a-35d288fe166a | Read 2 live rows and 2 tombstone cells | host1 | 1622 | null | ReadStage-3
eb20c950-7756-11ec-bf3a-35d288fe166a | eb213e80-7756-11ec-a029-857a112314fb | READ message received from /host1 | host2 | 1 | null | MessagingService-Incoming-/host1
eb20c950-7756-11ec-bf3a-35d288fe166a | eb213e80-7756-11ec-a9d1-87a6d519c8de | READ message received from /host1 | host4 | 6 | null | MessagingService-Incoming-/host1
eb20c950-7756-11ec-bf3a-35d288fe166a | eb213e80-7756-11ec-bf3a-35d288fe166a | REQUEST_RESPONSE message received from /host4 | host1 | 2371 | null | MessagingService-Incoming-/host4
eb20c950-7756-11ec-bf3a-35d288fe166a | eb213e81-7756-11ec-a029-857a112314fb | Executing single-partition query on table1 | host2 | 100 | null | ReadStage-1
eb20c950-7756-11ec-bf3a-35d288fe166a | eb213e81-7756-11ec-bf3a-35d288fe166a | Processing response from /host4 | host1 | 2437 | null | RequestResponseStage-2
eb20c950-7756-11ec-bf3a-35d288fe166a | eb213e82-7756-11ec-a029-857a112314fb | Acquiring sstable references | host2 | 117 | null | ReadStage-1
eb20c950-7756-11ec-bf3a-35d288fe166a | eb216590-7756-11ec-a029-857a112314fb | Partition index with 0 entries found for sstable 2 | host2 | 665 | null | ReadStage-1
eb20c950-7756-11ec-bf3a-35d288fe166a | eb216590-7756-11ec-a9d1-87a6d519c8de | Executing single-partition query on table1 | host4 | 137 | null | ReadStage-8
eb20c950-7756-11ec-bf3a-35d288fe166a | eb216590-7756-11ec-bf3a-35d288fe166a | REQUEST_RESPONSE message received from /host2 | host1 | 39 | null | MessagingService-Incoming-/host2
eb20c950-7756-11ec-bf3a-35d288fe166a | eb216591-7756-11ec-a029-857a112314fb | Key cache hit for sstable 1 | host2 | 1063 | null | ReadStage-1
eb20c950-7756-11ec-bf3a-35d288fe166a | eb216591-7756-11ec-a9d1-87a6d519c8de | Acquiring sstable references | host4 | 171 | null | ReadStage-8
eb20c950-7756-11ec-bf3a-35d288fe166a | eb216591-7756-11ec-bf3a-35d288fe166a | Processing response from /host2 | host1 | 129 | null | RequestResponseStage-2
eb20c950-7756-11ec-bf3a-35d288fe166a | eb216592-7756-11ec-a029-857a112314fb | Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones | host2 | 1087 | null | ReadStage-1
eb20c950-7756-11ec-bf3a-35d288fe166a | eb216592-7756-11ec-a9d1-87a6d519c8de | Key cache hit for sstable 2 | host4 | 198 | null | ReadStage-8
eb20c950-7756-11ec-bf3a-35d288fe166a | eb216592-7756-11ec-bf3a-35d288fe166a | Initiating read-repair | host1 | 174 | null | RequestResponseStage-2
eb20c950-7756-11ec-bf3a-35d288fe166a | eb216593-7756-11ec-a029-857a112314fb | Merged data from memtables and 2 sstables | host2 | 1354 | null | ReadStage-1
eb20c950-7756-11ec-bf3a-35d288fe166a | eb216593-7756-11ec-a9d1-87a6d519c8de | Key cache hit for sstable 1 | host4 | 228 | null | ReadStage-8
eb20c950-7756-11ec-bf3a-35d288fe166a | eb216594-7756-11ec-a029-857a112314fb | Read 2 live rows and 2 tombstone cells | host2 | 1386 | null | ReadStage-1
eb20c950-7756-11ec-bf3a-35d288fe166a | eb216594-7756-11ec-a9d1-87a6d519c8de | Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones | host4 | 238 | null | ReadStage-8
eb20c950-7756-11ec-bf3a-35d288fe166a | eb216595-7756-11ec-a029-857a112314fb | Enqueuing response to /host1 | host2 | 1394 | null | ReadStage-1
eb20c950-7756-11ec-bf3a-35d288fe166a | eb216595-7756-11ec-a9d1-87a6d519c8de | Merged data from memtables and 2 sstables | host4 | 391 | null | ReadStage-8
eb20c950-7756-11ec-bf3a-35d288fe166a | eb216596-7756-11ec-a029-857a112314fb | Sending REQUEST_RESPONSE message to /host1 | host2 | 1431 | null | MessagingService-Outgoing-/host1-Small
eb20c950-7756-11ec-bf3a-35d288fe166a | eb216596-7756-11ec-a9d1-87a6d519c8de | Read 2 live rows and 2 tombstone cells | host4 | 415 | null | ReadStage-8
eb20c950-7756-11ec-bf3a-35d288fe166a | eb216597-7756-11ec-a9d1-87a6d519c8de | Enqueuing response to /host1 | host4 | 423 | null | ReadStage-8
eb20c950-7756-11ec-bf3a-35d288fe166a | eb216598-7756-11ec-a9d1-87a6d519c8de | Sending REQUEST_RESPONSE message to /host1 | host4 | 499 | null | MessagingService-Outgoing-/host1-Small
For the item id above (1001), we have 2000 records.
When the RF is 3, why is the query executed against more than 3 hosts? We are expecting a query response within 20 ms. Is anything wrong with the configuration, and how can the settings be tweaked to achieve a sub-ms response time?
Upvotes: 1
Views: 136
Reputation: 27304
Depending on the driver settings, if the co-ordinator chosen is not one of the nodes holding a replica, the read requires at least 2 other nodes to be contacted. That can result in 3 nodes appearing in the trace: 1 co-ordinator and 2 replicas.
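The node count above can be worked out with a little arithmetic. The following is a minimal sketch that assumes the read runs at a quorum-style consistency level (the question does not state which level is in use); the function names are hypothetical and only illustrate the counting.

```python
def replicas_required(replication_factor: int, consistency_level: str) -> int:
    """Return how many replicas must answer a read at the given level."""
    level = consistency_level.upper()
    if level == "ONE":
        return 1
    if level == "TWO":
        return 2
    if level in ("QUORUM", "LOCAL_QUORUM"):
        # A quorum is a strict majority of the replicas.
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {consistency_level}")


def nodes_in_trace(replication_factor: int, consistency_level: str,
                   coordinator_is_replica: bool) -> int:
    """Count the nodes touched by a read with no read repair or retry."""
    replicas = replicas_required(replication_factor, consistency_level)
    # A co-ordinator that holds no replica is one extra node in the trace.
    return replicas if coordinator_is_replica else replicas + 1
```

With RF 3 at QUORUM and a co-ordinator that is not a replica, `nodes_in_trace(3, "QUORUM", False)` gives 3, which matches the 1 co-ordinator plus 2 replicas described above.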
However, there are 2 other things going on that cause the node count to be more than just the 3.
Initiating read-repair
If a digest mismatch between the 2 replicas is detected, the co-ordinator synchronously repairs the data between all replicas, which itself delays the response. This results in the scenario you see, with 4 nodes (co-ordinator + 3 replicas) being contacted.
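To make the digest comparison concrete, here is a toy model in plain Python. It is not Cassandra's implementation: the data layout (clustering key mapped to a value and write timestamp), the use of MD5, and the function names are all assumptions chosen for illustration. It only shows the idea that differing digests trigger a merge by newest timestamp, followed by a write-back to the stale replicas.

```python
import hashlib


def digest(rows: dict) -> str:
    """Hash a replica's rows; rows maps clustering key -> (value, timestamp)."""
    h = hashlib.md5()
    for key in sorted(rows):
        value, ts = rows[key]
        h.update(repr((key, value, ts)).encode())
    return h.hexdigest()


def read_with_repair(replicas: dict):
    """replicas maps host -> rows. Returns (merged_rows, repaired_hosts)."""
    digests = {host: digest(rows) for host, rows in replicas.items()}
    if len(set(digests.values())) == 1:
        # All digests agree, so any replica's data can be returned as-is.
        return next(iter(replicas.values())), []

    # Digest mismatch: merge all rows, keeping the newest write per key.
    merged = {}
    for rows in replicas.values():
        for key, (value, ts) in rows.items():
            if key not in merged or ts > merged[key][1]:
                merged[key] = (value, ts)

    # Write the merged result back to every replica that was out of date.
    repaired = [host for host, rows in replicas.items() if rows != merged]
    for host in repaired:
        replicas[host] = dict(merged)
    return merged, repaired
```

The extra round trip to fetch full data and write it back is the part that adds latency to the original read.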
speculating read retry on /host2
The read had taken long enough that the co-ordinator speculatively sent it to an additional replica (host2) rather than continuing to wait for the first one. This behaviour is governed by the table's speculative_retry setting rather than by the driver, and it adds another node to those answering the query.
The last part of the question is about performance, which is trickier. Ensuring the data is repaired and avoiding digest mismatches would of course help, but within the trace (which is perhaps just test data) we can see 2 live rows and 2 tombstone cells. Tombstones will not help performance, and from the look of it, data is being read from multiple sstables. More information about the usage pattern would be needed to understand the tuning options, but if there is a high level of deletions / updates occurring, the use of STCS is going to cause issues.
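When reading a long trace like the one in the question, it can help to pull out just the hosts involved and the live row / tombstone counts. The sketch below assumes the trace is available as the pipe-delimited text that cqlsh prints, with the activity in the third column and the source host in the fourth; `summarize_trace` is a hypothetical helper, not part of any Cassandra tooling.

```python
import re

# Matches trace activities such as "Read 2 live rows and 2 tombstone cells".
READ_RE = re.compile(r"Read (\d+) live rows and (\d+) tombstone cells")


def summarize_trace(lines):
    """Return the distinct source hosts and per-host read statistics."""
    hosts = set()
    reads = []
    for line in lines:
        cols = [c.strip() for c in line.split("|")]
        # Skip the header row, the dashed separator, and malformed lines.
        if len(cols) < 5 or cols[0] == "session_id":
            continue
        activity, source = cols[2], cols[3]
        hosts.add(source)
        match = READ_RE.search(activity)
        if match:
            reads.append((source, int(match.group(1)), int(match.group(2))))
    return {"hosts": sorted(hosts), "reads": reads}
```

Running this over the rows of the trace gives a quick view of how many nodes took part and which of them reported tombstones.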
Upvotes: 2