jeff17237

Reputation: 615

Avoiding filtering with a compound partition key in Cassandra

I am fairly new to Cassandra and currently have the following table:

CREATE TABLE time_data (
    id int,
    secondary_id int,
    timestamp timestamp,
    value bigint,
    PRIMARY KEY ((id, secondary_id), timestamp)
);

The compound partition key (with secondary_id) is necessary to avoid exceeding the maximum partition size.

The issue I am running into is that I would like to run the query SELECT * FROM time_data WHERE id = ?. Because the table has a compound partition key, this query requires filtering. I realize it reads a lot of data across many partitions, but it is necessary for the application. For reference, id has relatively low cardinality and secondary_id has high cardinality.

What is the best way around this? Should I simply allow filtering on the query? Or is it better to create a secondary index like CREATE INDEX id_idx ON time_data (id)?

Upvotes: 0

Views: 430

Answers (1)

Mandraenke

Reputation: 3266

You will need to specify the full partition key in your queries (ALLOW FILTERING will hurt performance badly in most cases).

One option, if you know all secondary_id values (you could add a table to track them if necessary), is to do the work in your application: query every (id, secondary_id) pair and process the results afterwards. This has the disadvantage of being more complex, but the advantage that the queries can be issued asynchronously and in parallel, so many nodes in your cluster take part in processing your task.

See also https://www.datastax.com/dev/blog/java-driver-async-queries
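
A minimal Python sketch of that fan-out approach, in case it helps. Everything beyond time_data is an assumption for illustration: the lookup table secondary_ids_by_id, the function name fetch_all_for_id, and the worker count are hypothetical. The execute argument is any callable that takes a CQL string plus a parameter tuple and returns rows; with the DataStax Python driver, session.execute should fit that shape, but treat that as an assumption to verify against your driver version.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List

# Hypothetical lookup table (not in the original schema) that records
# which secondary_id values exist for each id. It would have to be
# written alongside every insert into time_data.
CREATE_LOOKUP = """
CREATE TABLE IF NOT EXISTS secondary_ids_by_id (
    id int,
    secondary_id int,
    PRIMARY KEY (id, secondary_id)
)
"""

SELECT_SECONDARY_IDS = "SELECT secondary_id FROM secondary_ids_by_id WHERE id = %s"

# Full partition key is specified, so no ALLOW FILTERING is needed.
SELECT_PARTITION = "SELECT * FROM time_data WHERE id = %s AND secondary_id = %s"

# Assumed shape of the query function: (cql, params) -> iterable of rows.
Execute = Callable[[str, tuple], Iterable]


def fetch_all_for_id(execute: Execute, id_: int, max_workers: int = 16) -> List:
    """Read every (id, secondary_id) partition for one id in parallel."""
    # Step 1: find the secondary_id values known for this id.
    secondary_ids = [row[0] for row in execute(SELECT_SECONDARY_IDS, (id_,))]

    # Step 2: one single-partition query per (id, secondary_id) pair.
    def fetch(secondary_id: int) -> list:
        return list(execute(SELECT_PARTITION, (id_, secondary_id)))

    # Step 3: run the queries concurrently and flatten the results.
    # pool.map returns results in the same order as secondary_ids.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_partition = pool.map(fetch, secondary_ids)
        return [row for rows in per_partition for row in rows]
```

With a real cluster this would be called as something like fetch_all_for_id(session.execute, 42). A thread pool is used here only to keep the sketch driver-agnostic; a driver's native async API (as in the linked article for Java) would serve the same purpose. Capping max_workers keeps the client from flooding the cluster when an id has many secondary_id values.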

Upvotes: 1
