Reputation: 1313
I'm writing a Java application backed by Cassandra. I'm issuing a query with a very large number of values (more than 100,000) in its IN clause:
SELECT country, gender FROM persons WHERE person_id IN (1,7,18, 34,...,)
But putting so many values in an IN clause looks bad to me.
I could also issue many individual requests like this (again, more than 100,000 iterations):
for (Integer id : ids) {
    ResultSet res = session.execute(preparedStatement(id));
    // process the data returned by Cassandra
}
That isn't any better: it takes too long.
Is there an API or a pattern I should follow in this case?
Thank you
Upvotes: 1
Views: 58
Reputation: 87174
If person_id is the partition key (as the query suggests), then using IN will cause a lot of problems: it overloads the coordinator node, which has to collect the results from the other nodes.
In this case the most effective approach is to fire individual requests, but to issue them via executeAsync so that they are sent to different nodes. You'll need to control how many requests are in flight at once, for example with a counting semaphore, and you may also need to tune the connection pooling parameters that control the number of in-flight requests: https://docs.datastax.com/en/developer/java-driver/3.6/manual/pooling/
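The throttling idea can be sketched roughly as follows. This is only an illustration, not driver code: `ThrottledAsync`, `fetchAll`, and `maxInFlight` are hypothetical names, and the `query` function is a stand-in for whatever wraps your `session.executeAsync(...)` call. It uses only `java.util.concurrent`, with a `Semaphore` capping the number of outstanding requests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;
import java.util.function.Function;

// Hypothetical helper: runs one async query per key with a bounded number in flight.
public class ThrottledAsync {

    public static <K, V> Map<K, V> fetchAll(List<K> keys,
                                            int maxInFlight,
                                            Function<K, CompletionStage<V>> query) {
        Semaphore permits = new Semaphore(maxInFlight);
        Map<K, V> results = new ConcurrentHashMap<>();
        List<CompletableFuture<Void>> pending = new ArrayList<>();

        for (K key : keys) {
            // Blocks the submitting thread while maxInFlight requests are outstanding.
            permits.acquireUninterruptibly();
            try {
                CompletableFuture<Void> done = query.apply(key)
                        .toCompletableFuture()
                        // Release the permit whether the request succeeded or failed.
                        .whenComplete((value, error) -> permits.release())
                        // ConcurrentHashMap rejects nulls, so skip empty results.
                        .thenAccept(value -> {
                            if (value != null) {
                                results.put(key, value);
                            }
                        });
                pending.add(done);
            } catch (RuntimeException e) {
                // The query function failed before returning a future.
                permits.release();
                throw e;
            }
        }

        // Wait for the tail of requests; join() rethrows the first failure, if any.
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        return results;
    }
}
```

With the 4.x Java driver, where `executeAsync` returns a `CompletionStage`, the `query` argument could be something like `id -> session.executeAsync(prepared.bind(id))`. With the 3.x driver linked above, `executeAsync` returns a `ResultSetFuture` (a Guava `ListenableFuture`), so you would release the permit from a listener or callback on that future instead. The right value for `maxInFlight` depends on your cluster and pooling settings, so treat it as something to tune.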
Upvotes: 3