Cassandra filtering possible?

Question

I have a Cassandra table that is created like:

CREATE TABLE table(
  num int,
  part_key int,
  val1 int,
  val2 float,
  val3 text,
  ...,
  PRIMARY KEY((part_key), num)
);

part_key is 1 for every record, because I want to execute range queries and only have one server (I know that's not a good use case). num is the record number, from 1 to 1,000,000. I can already run queries like

SELECT num, val43 FROM table WHERE part_key=1 and num<5000;

Is it possible to do some more filtering in Cassandra, like:

 ... AND val45>463;

I think it's not possible like that, but can somebody explain why? Right now I do this filtering in my code, but are there other possibilities?

I hope I did not miss a post that already explains this.

Thank you for your help!

ashic · Accepted Answer

Cassandra range queries are only possible on the last clustering column specified by the query, with every clustering column before it restricted by equality. So, if your pk is (a,b,c,d), you can do

... where a=2 and b=4 and c>5
... where a=2 and b>4

but not

... where a=2 and c>5

This is because data is stored in partitions, indexed by partition key (the first part of the pk), and within each partition rows are sorted by each successive clustering key.
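To make the sorting argument concrete, here is a toy Python model of a single partition. It is only an illustration under simplifying assumptions (two integer clustering columns b and c, rows held in a plain sorted list); the function names are made up for this sketch and it is not how Cassandra is actually implemented.

```python
from bisect import bisect_left, bisect_right

# Toy model of one partition (a is fixed): rows sorted by clustering columns (b, c).
# Illustration only; the names here are hypothetical, not Cassandra internals.
partition = sorted((b, c) for b in range(1, 4) for c in range(1, 4))


def slice_b_eq_c_gt(rows, b, c_min):
    """b = ? and c > ?: the matches form one contiguous slice of the sorted rows."""
    lo = bisect_right(rows, (b, c_min))  # first row after (b, c_min)
    hi = bisect_left(rows, (b + 1,))     # first row whose b is larger
    return rows[lo:hi]


def positions_c_gt(rows, c_min):
    """c > ? with b unrestricted: every row must be checked."""
    return [i for i, (_, c) in enumerate(rows) if c > c_min]


contiguous = slice_b_eq_c_gt(partition, 2, 1)
scattered = positions_c_gt(partition, 2)
```

With b pinned, the matching rows sit next to each other and can be found with two binary searches. Without it, the matches are spread across the whole partition, which is the kind of scan Cassandra refuses to do by default.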

If you have exact values, you can add a secondary index on val4 and then do

... and val4=34

but that's about it. And even then, you want to hit a partition before applying the index. Otherwise you'll get a cluster-wide query that will likely time out.

The querying limitations are there because of the way Cassandra stores data for fast insert and retrieval. All data in a partition is held together, so filtering within the partition on the client side is usually not a problem, unless you have very large wide rows (in which case the schema should probably be reviewed).
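For the client-side route, a minimal sketch in Python could look like the following. The Row type, the filter_val45 helper and the sample data are all hypothetical stand-ins; in practice the rows would come from your driver's result set for the range query on part_key and num, and that query would have to select val45 as well.

```python
from collections import namedtuple

# Hypothetical row shape; the real table has more columns.
Row = namedtuple("Row", ["num", "val43", "val45"])


def filter_val45(rows, threshold):
    """Yield rows whose val45 exceeds threshold, one at a time."""
    for row in rows:
        if row.val45 > threshold:
            yield row


# Stand-in for the rows returned by the range query on (part_key, num).
sample = [Row(1, 10, 500), Row(2, 20, 100), Row(3, 30, 464)]
matches = list(filter_val45(sample, 463))
```

Because the helper is a generator, it can consume the result set as it streams in rather than loading every row into memory first.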

Hope that helps.
