Reputation: 1814
I'm trying to insert 50,000 records into a five-node Cassandra cluster. I'm using executeAsync to increase performance (reduce insertion time on the application side). I tried BatchStatement with several batch sizes, but every time I got the following exception:
Exception in thread "main" com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency ONE (1 replica were required but only 0 acknowledged the write)
at com.datastax.driver.core.exceptions.WriteTimeoutException.copy(WriteTimeoutException.java:54)
at com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:259)
at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:175)
at
I was able to insert 10,000, 20,000, and up to 40,000 records without any issue. The following is the Java code I wrote:
for (batchNumber = 1; batchNumber <= batches; batchNumber++) {
    BatchStatement batch = new BatchStatement();
    for (record = 1; record <= batchSize; record++) {
        batch.add(ps.bind(query));
    }
    futures.add(session.executeAsync(batch));
}
for (ResultSetFuture future : futures) {
    resultSet = future.getUninterruptibly();
}
where ps is the prepared statement, batches is the number of batches, and batchSize is the number of records in a batch.
I'm unable to understand the root cause of the issue. I thought some of the nodes were down, but when I checked, all of them were running normally.
How should I debug this exception?
Upvotes: 1
Views: 4801
Reputation: 5180
I see a few mistakes. Let's restart:
1. BATCH overloads the coordinator node. The larger the batch (both in terms of KB and number of statements), the greater the load on the coordinator. That's how BATCH works: one node is chosen to coordinate all the statements, and that node is responsible for all of them. The coordinator is usually chosen based on the first statement, so if your statements hit multiple nodes, the coordinator will need to coordinate data belonging to other nodes as well. If you instead fired multiple separate async queries, every node would be responsible for its own statements only, and you'd spread the load across all your cluster nodes instead of hammering one node.
2. You're binding the prepared statement the wrong way: ps.bind(query) binds the query string itself instead of the field values. You should bind the actual values, e.g. with a new BoundStatement(ps).bind(xxxx) statement. That's an easy fix anyway.
3. You keep adding futures to the list, and your application will eventually be killed by an OOM error. Moreover, you're not giving your cluster the chance to actually ingest all the data you're firing at it, because you can fire data way faster than your cluster can ingest it. What you need to do is limit the number of futures in the list. Keep it to some value at most (say 1000). To do that, move your final loop with .getUninterruptibly inside the outer loop (see the sketch after this list). This way you throttle the ingestion rate and will see fewer timeout exceptions, and depending on the application, fewer timeout exceptions means fewer retries, hence fewer queries, less overhead, and better response times.
4. You call .getUninterruptibly on the futures in the list, but keep in mind that when your cluster is overloaded, you will get timeouts. At that point you should catch the exception and deal with it, be it a retry, a re-throw, or whatever else. I suggest you design your model around idempotent queries, so you can retry failed queries until they succeed without worrying about the consequences of retries (which can happen at driver level too!).
Hope that helps.
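To illustrate points 1 to 3, here is a minimal sketch of individual async inserts with a bounded window of pending futures. It assumes driver 3.x, an open Session, and a PreparedStatement ps whose bind variables match each record's values; the Object[] record shape and the MAX_PENDING value of 1000 are placeholders, not something from your question:

import java.util.ArrayList;
import java.util.List;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.exceptions.WriteTimeoutException;

void insertAll(Session session, PreparedStatement ps, List<Object[]> records) {
    final int MAX_PENDING = 1000;                  // cap on in-flight writes (tune for your cluster)
    List<ResultSetFuture> pending = new ArrayList<>();
    for (Object[] values : records) {
        // One small async statement per record; bind the values, not the query string
        pending.add(session.executeAsync(ps.bind(values)));
        if (pending.size() >= MAX_PENDING) {       // window full: drain before firing more
            drain(pending);
        }
    }
    drain(pending);                                // wait for the remainder
}

void drain(List<ResultSetFuture> pending) {
    for (ResultSetFuture f : pending) {
        try {
            f.getUninterruptibly();
        } catch (WriteTimeoutException e) {
            // overloaded cluster: retry, re-throw, or log; idempotent queries make retries safe
        }
    }
    pending.clear();
}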
Upvotes: 5
Reputation: 38807
That's not what BATCH is for.
When you add multiple statements to a batch, Cassandra will try to apply them atomically.
Either all of them will succeed or none of them will, and they all have to complete within a single query timeout.
Also, if you make more requests than can be handled simultaneously, they're going to go into a queue, and time waiting in the queue contributes to the timeout.
To get them all through without timeout, use individual statements and limit the number in flight at any one time.
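One way to bound the in-flight count is a semaphore that is released when each write completes. A sketch, assuming driver 3.x (where ResultSetFuture is a Guava ListenableFuture); the permit count of 512 and the rows variable are illustrative placeholders:

import java.util.concurrent.Semaphore;
import com.datastax.driver.core.ResultSetFuture;
import com.google.common.util.concurrent.MoreExecutors;

final Semaphore inFlight = new Semaphore(512);  // at most 512 concurrent writes
for (Object[] row : rows) {
    inFlight.acquireUninterruptibly();          // blocks until a slot frees up
    ResultSetFuture f = session.executeAsync(ps.bind(row));
    // The listener runs on success or failure, so the slot is always returned
    f.addListener(inFlight::release, MoreExecutors.directExecutor());
}
inFlight.acquireUninterruptibly(512);           // all permits back = all writes finished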
Alternatively, use a COPY command to load the data from a CSV file.
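For example, from cqlsh (the keyspace, table, column, and file names here are placeholders):

COPY mykeyspace.mytable (id, name, value) FROM 'records.csv' WITH HEADER = true;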
Upvotes: 1