AntonBoarf

Reputation: 1313

Java - Cassandra with plenty of parameters in "IN"

I'm writing a Java application that uses Cassandra. I'm issuing a query with a very large number of values (more than 100,000) in the 'IN' clause:

SELECT country, gender FROM persons WHERE person_id IN (1,7,18, 34,...,)

But putting so many parameters in "IN" looks like a bad idea to me.

I could also issue many separate requests like this (once again, more than 100,000 iterations):

for (Integer id : ids) {
    // bind the id to the prepared statement and run it synchronously
    ResultSet res = session.execute(preparedStatement.bind(id));
    // process the data returned by Cassandra
}

That isn't any better: it takes too long.

Is there an API or a pattern I should follow in my case?

Thank you

Upvotes: 1

Views: 58

Answers (1)

Alex Ott

Reputation: 87174

If person_id is the partition key (as the query suggests), then using IN will cause a lot of problems: it overloads the coordinator node, which has to collect the results from all the other nodes.

In this case the most effective approach is to fire individual requests, but issue them via executeAsync so that they are sent to different nodes. You will then need to control how many requests are in flight at once, for example with a counting semaphore, and you may also need to tune the connection pooling parameters that control the number of in-flight requests: https://docs.datastax.com/en/developer/java-driver/3.6/manual/pooling/
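A minimal sketch of the semaphore idea, using only the JDK so it is independent of the driver version. The class `ThrottledQueries`, the method `fetchAll`, and the `query` parameter are hypothetical names made up for illustration; `query` stands in for whatever wraps your `executeAsync` call, and the right `maxInFlight` value is something you would have to find by testing against your cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.function.Function;

// Hypothetical helper, not part of the Cassandra driver.
final class ThrottledQueries {

    /**
     * Starts one asynchronous query per id while never allowing more than
     * maxInFlight of them to be outstanding at the same time.
     * Results are returned in the same order as the ids.
     */
    static <T> List<T> fetchAll(List<Integer> ids,
                                int maxInFlight,
                                Function<Integer, CompletableFuture<T>> query)
            throws InterruptedException {
        Semaphore permits = new Semaphore(maxInFlight);
        List<CompletableFuture<T>> futures = new ArrayList<>(ids.size());

        for (Integer id : ids) {
            // Blocks here while maxInFlight requests are still pending.
            permits.acquire();
            CompletableFuture<T> future;
            try {
                future = query.apply(id);
            } catch (RuntimeException e) {
                // The request was never sent, so give the permit back.
                permits.release();
                throw e;
            }
            // Free the permit when the request finishes, whether it succeeded or failed.
            future.whenComplete((result, error) -> permits.release());
            futures.add(future);
        }

        // Wait for everything; join() rethrows a failed query as CompletionException.
        List<T> results = new ArrayList<>(futures.size());
        for (CompletableFuture<T> future : futures) {
            results.add(future.join());
        }
        return results;
    }
}
```

With the 4.x driver, where `executeAsync` returns a `CompletionStage`, the call could look like `ThrottledQueries.fetchAll(ids, 256, id -> session.executeAsync(preparedStatement.bind(id)).toCompletableFuture())`; the value 256 is only a placeholder. With the 3.x driver linked above, `executeAsync` returns a `ResultSetFuture` (a Guava `ListenableFuture`), so you would have to adapt it to a `CompletableFuture` or release the permit from a Guava callback instead. For 100,000 ids you may also prefer to process each result inside the callback rather than collecting everything into one list.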

Upvotes: 3
