Alexis Seigneurin
Reputation: 1483

Cassandra read timeouts on AWS

TL;DR: I'm using Cassandra. I'm running tests to see whether it will handle the load, but I get lots of timeouts when reading the data.

com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency LOCAL_ONE (1 responses were required but only 0 replica responded)

I have set up a Cassandra cluster on AWS: 8 m4.xlarge instances, each with two 100 GB EBS volumes of type 'gp2' (commit log on one volume, the rest of the data on the other). The instances are in the same availability zone, in a VPC. I'm using the stock version of Apache Cassandra 3.7 with no specific tuning of the servers or of Cassandra itself.

I have loaded 1 billion records. Each of them has about 30 fields. The primary key consists of two partition key columns and one clustering column. There are about 10 records per partition key. The replication factor is 3. Each of the 8 nodes stores about 40 GB of data after compaction.

My test consists of running 1000 queries on random keys with a basic Scala application using the Datastax Cassandra driver. The WHERE clause contains the full partition key but not the clustering column, so each query reads all the records of one partition.

When the queries are sequential, all the queries return expected results and the average response time is 74 ms.

When I use async queries, make all the queries at once and call get() on the Futures, I get many timeouts after 5 seconds (between 25% and 75% of the queries fail).

I assumed the EBS drives might be throttled and I tried with a different cluster: 3 nodes of type i2.xlarge with data stored on the ephemeral drives.

Note that, during my tests, compaction was no longer running. I did not see the garbage collector kicking in during the queries.

Any idea why the queries are timing out?

Upvotes: 0

Views: 437

Answers (1)

doanduyhai
Reputation: 8812

When I use async queries, make all the queries at once and call get() on the Futures, I get many timeouts after 5 seconds (between 25% and 75% of the queries fail).

Did you throttle your async queries? How many SELECT statements did you send to the cluster asynchronously?
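
If the queries are not throttled, one way to cap the number in flight is a semaphore around the async call. The following is only a minimal sketch using the Scala standard library, not the driver's own mechanism: `runThrottled`, `maxInFlight` and `fakeQuery` are hypothetical names, and `fakeQuery` merely simulates latency. In a real application, `query` would wrap the driver's async call (for example, the future returned by `session.executeAsync` converted to a Scala `Future`), which is not shown here.

```scala
import java.util.concurrent.Semaphore
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, ExecutionContext, Future, blocking}
import scala.concurrent.duration._
import scala.util.control.NonFatal

object ThrottledQueries {

  /** Runs `query` for every key, keeping at most `maxInFlight` requests outstanding.
    * The submitting thread blocks while the limit is reached, so call this from a
    * thread that is not needed to complete the futures.
    */
  def runThrottled[K, R](keys: Seq[K], maxInFlight: Int)(query: K => Future[R])
                        (implicit ec: ExecutionContext): Future[Seq[R]] = {
    val permits = new Semaphore(maxInFlight)
    val futures = keys.map { key =>
      permits.acquire() // wait until one of the in-flight requests has finished
      // Turn a synchronous failure into a failed future so the permit is always released
      val result: Future[R] =
        try query(key) catch { case NonFatal(e) => Future.failed(e) }
      result.onComplete(_ => permits.release())
      result
    }
    // Fails as soon as any single query fails; collect Try values instead if
    // partial results are wanted.
    Future.sequence(futures)
  }

  def main(args: Array[String]): Unit = {
    import ExecutionContext.Implicits.global

    val inFlight = new AtomicInteger(0)
    val peak     = new AtomicInteger(0)

    // Hypothetical stand-in for a real driver call: sleeps to imitate latency
    // and records how many calls overlap.
    def fakeQuery(key: Int): Future[Int] = Future {
      val now = inFlight.incrementAndGet()
      peak.accumulateAndGet(now, (a: Int, b: Int) => math.max(a, b))
      blocking { Thread.sleep(10) }
      inFlight.decrementAndGet()
      key * 2
    }

    val all = runThrottled(1 to 1000, maxInFlight = 4)(fakeQuery)
    val results = Await.result(all, 1.minute)
    println(s"completed ${results.size} queries, peak concurrency ${peak.get}")
  }
}
```

With a cap like this, the 1000 queries from the test are still issued asynchronously, but the cluster only ever sees `maxInFlight` of them at once. A reasonable value depends on the cluster and has to be found by measurement.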

Upvotes: 2
