Reputation: 3262
I am trying to understand Cassandra's concurrent reads and writes. I came across the property
concurrent_reads (default is 8)
A good rule of thumb is 4 concurrent_reads per processor core. May increase the value for systems with fast I/O storage
So, as per that definition (correct me if I am wrong), 4 threads can read from the database concurrently. Let's say I am trying to run the following query:
SELECT max(column1) FROM testtable WHERE duration = 'month';
I am just trying to execute this one query. What will be the use of concurrent_reads in executing it?
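For reference, the setting I am asking about lives in cassandra.yaml (the value below is just an example, not a recommendation):

# cassandra.yaml -- example value only
concurrent_reads: 8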
Upvotes: 1
Views: 4431
Reputation: 16400
That's how many active reads can run at a single time per host. You can see this with nodetool tpstats, under the read stage. If Active is pegged at the number of concurrent readers and you have a Pending queue, it may be worth trying to increase this. It's pretty normal for people to have this at ~128 when using decent-sized heaps and SSDs. This is very hardware dependent, so the defaults are conservative.
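For example, the Active and Pending columns of the ReadStage row are the ones to watch in nodetool tpstats (output layout varies slightly by version, and the numbers below are made up to show a saturated node):

$ nodetool tpstats
Pool Name        Active   Pending   Completed   Blocked   All time blocked
ReadStage             8       120     3456789         0                  0
MutationStage         0         0     1234567         0                  0

Here Active is pegged at concurrent_reads (8) and requests are queuing in Pending, which is the situation where raising concurrent_reads can help.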
Keep in mind that the work done on these threads is very fast, usually sub-millisecond. But even assuming each read takes 1 ms, with only 4 readers Little's law gives you a maximum of about 4,000 (local) reads per second per node (1000/1 * 4). With RF=3 and QUORUM consistency you are doing a minimum of 2 replica reads per request, so you can divide by 2 to get a theoretical (real life is ickier) max throughput.
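Spelling that out (the 1 ms service time and 4 readers are just the assumptions above, not measurements):

local reads per node per second  ≈ concurrent_reads / service time = 4 / 0.001 s = 4,000
replica reads per client request at QUORUM with RF=3 = 2
theoretical per-node request ceiling ≈ 4,000 / 2 = 2,000 requests/second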
The aggregation functions (i.e. max) are processed on the coordinator after fetching the data from the replicas (each replica doing a local read and sending a response), and are not directly impacted by concurrent_reads, since they are handled in the native transport and request-response stages.
Upvotes: 5
Reputation: 2379
From Cassandra 2.2 onward, the standard aggregate functions min, max, avg, sum and count are built-in. So I don't think concurrent_reads will have any effect on your query.
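For example (assuming the table from the question, with duration as the partition key and column1 being numeric), all of these use the built-in aggregates:

SELECT min(column1), max(column1), avg(column1), sum(column1), count(column1)
FROM testtable
WHERE duration = 'month';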
Upvotes: 1