Reputation: 3262
I am trying to understand Cassandra's concurrent reads and writes. I came across the property
concurrent_reads (default is 8)
A good rule of thumb is 4 concurrent_reads per processor core. May increase the value for systems with fast I/O storage
So, as per that definition (correct me if I am wrong), 4 threads can read from the database concurrently. Let's say I am trying to run the following query:
SELECT max(column1) FROM testtable WHERE duration = 'month';
I am just trying to execute this one query. What will be the use of concurrent_reads in executing it?
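For reference, the setting I am asking about lives in cassandra.yaml (the value below is just an example, not a recommendation):

# cassandra.yaml -- example value only
concurrent_reads: 8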
Upvotes: 1
Views: 4431
Reputation: 16400
That's how many active reads can run at a single time per host. You can see this with nodetool tpstats, under the read stage. If Active is pegged at the number of concurrent readers and you have a Pending queue, it may be worth trying to increase this. It's pretty normal for people to have this at ~128 when using decent-sized heaps and SSDs. This is very hardware dependent, so the defaults are conservative.
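For example, the Active and Pending columns of the ReadStage row are the ones to watch in nodetool tpstats (output layout varies slightly by version, and the numbers below are made up to show a saturated node):

$ nodetool tpstats
Pool Name        Active   Pending   Completed   Blocked   All time blocked
ReadStage             8       120     3456789         0                  0
MutationStage         0         0     1234567         0                  0

Here Active is pegged at concurrent_reads (8) and requests are queuing in Pending, which is the situation where raising concurrent_reads can help.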
Keep in mind that the work done on these threads is very fast, usually sub-millisecond. But even assuming each read takes 1 ms, with only 4 readers Little's law gives you a maximum of about 4,000 (local) reads per second per node (1000/1 * 4). With RF=3 and QUORUM consistency you are doing a minimum of 2 replica reads per request, so you can divide by 2 to get a theoretical (real life is ickier) max throughput.
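Spelling that out (the 1 ms service time and 4 readers are just the assumptions above, not measurements):

local reads per node per second  ≈ concurrent_reads / service time = 4 / 0.001 s = 4,000
replica reads per client request at QUORUM with RF=3 = 2
theoretical per-node request ceiling ≈ 4,000 / 2 = 2,000 requests/second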
The aggregation functions (i.e. max) are processed on the coordinator after fetching the data from the replicas (each replica doing a local read and sending a response), and are not directly impacted by concurrent_reads, since they are handled in the native transport and request-response stages.
Upvotes: 5
Reputation: 2379
From Cassandra 2.2 onward, the standard aggregate functions min, max, avg, sum and count are built-in. So I don't think concurrent_reads will have any effect on your query.
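For example (assuming the table from the question, with duration as the partition key and column1 being numeric), all of these use the built-in aggregates:

SELECT min(column1), max(column1), avg(column1), sum(column1), count(column1)
FROM testtable
WHERE duration = 'month';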
Upvotes: 1