Partial Partition Key Querying With Per Partition Limit In Cassandra

Question

I have a table (let's call it T) set up with a PRIMARY KEY like the following:

PRIMARY KEY ((A, B), C, ....);

I want to query it like the following:

SELECT * FROM T WHERE A = ? AND C <= ? PER PARTITION LIMIT 1 ALLOW FILTERING;

(Note that C is a timestamp value. I am essentially asking for the most recent row in every partition whose first partition key component, A, matches my input.)
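
To make the intended result concrete, here is a minimal Python sketch of what I expect the query to return. It is only an in-memory illustration of the semantics, not how Cassandra executes anything. It assumes the first row in each partition is the most recent one (that is, C is clustered in descending order). The `Row` shape, the `latest_per_partition` helper, and the sample data are all hypothetical.

```python
from collections import namedtuple

# Hypothetical row shape: (A, B) is the partition key, C is the clustering timestamp.
Row = namedtuple("Row", ["a", "b", "c", "value"])

def latest_per_partition(rows, a, c_max):
    """For each partition (a, B), return the row with the largest C <= c_max."""
    latest = {}
    for row in rows:
        if row.a != a or row.c > c_max:
            continue  # wrong A, or newer than the cutoff
        key = (row.a, row.b)
        if key not in latest or row.c > latest[key].c:
            latest[key] = row
    # Sorted by B only to make the output deterministic here;
    # Cassandra itself returns partitions in token order.
    return [latest[k] for k in sorted(latest)]

rows = [
    Row("a1", "b1", 10, "old"),
    Row("a1", "b1", 20, "new"),
    Row("a1", "b2", 15, "only"),
    Row("a1", "b2", 99, "too late"),
    Row("a2", "b1", 5, "other A"),
]

result = latest_per_partition(rows, "a1", 50)
for row in result:
    print(row)
```

With this sample data I would expect one row back for partition (a1, b1) and one for (a1, b2), and nothing for a2.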

This works with ALLOW FILTERING, and it makes sense why I need it: I do not know the values of B beforehand, and I do not care, since I want all of them. Cassandra therefore has to scan across partitions to give me the results, and it also makes sense that I have to explicitly allow filtering for that to happen.

However, I have read that we should avoid 'ALLOW FILTERING' at all costs, as it can have a huge performance impact, especially in production environments. Indeed, I only use allow filtering very sparingly in my existing applications, and this is usually for one-off queries that calculate something of this nature.

My question is this: is there a way to restructure this table or query to avoid filtering? I am thinking it is impossible, as I do not know the values of B beforehand, but I want to double check just to be sure. Thanks!

