BAE
BAE

Reputation: 8936

cassandra counter table design: which design is better

There are three keys and one counter:

A: 300 values [A1->A300]

B: 10 values [B1->B10]

C: 400 values [C1->C400]

counter count

There are two designs:

Design 1, primary key is ((A, C), B), (A, C) is partition key, B is cluster key

query: select count(*) from XXX where A = A1 and C in ( C91, C92, ..., C200) and B = B1

Design 2, primary key is (A, B, C), A is partition key, B and C are cluster keys

query: select count(*) from XXX where A = A1 and B = B1 and C <= C200 and C> C90

I have two questions:

Q1, As for the queries similar to select count(*) from XXX where A = A1 and B = B1 and C <= C200 and C> C90 (A1, B1, C200 and C90 are random), in the above two primary key design, which is better?

Q2, how many values in IN () can be supported?

Any comments welcomed. Thanks.

Upvotes: 0

Views: 85

Answers (2)

xmas79
xmas79

Reputation: 5180

I think design 2 is better.

Assuming in design 1 you're going to issue these kind of queries:

select count(*) from XXX where A = A1 and C in ( C91, C92, ..., C200) and B = B1

you are going to put a lot of stress on your coordinator. The use of IN clause at partition key level is discouraged unless you do it for a few values. This is because the coordinator will have wait the results from every partition before returning the rows to the client. And if one node fails, everything will fail, and you'll need to retry the whole thing. A nice explanation of this behaviour can be found here.

On the contrary, assuming in design 2 you'll run these kind of queries:

select count(*) from XXX where A = A1 and B = B1 and C <= C200 and C> C90

you will scan only one partition at time, and this is fine because C* will perform a range scan. A perfectly safe scenario, unless your partition gets very wide.

Upvotes: 0

Ashraful Islam
Ashraful Islam

Reputation: 12830

Q1. Design 1 is better
Because In your case :
Design 1 max value of a single partition can be only the number of c values
but For
Design 2 max value of a single partition can be the number of b * c values

So the search space for Design 1 is very small.It is very scale-able

Q2. Max value supported in IN Query is 65535
If you try to send number if value greater than this value, it will throw IllegalArgumentException("Too many values for IN clause, the maximum allowed is 65535");

Upvotes: 1

Related Questions