Reputation: 6261
I'm planning to store log records in Cassandra, and primarily need to be able to query them by date range. My primary key is a time-based UUID. I've seen lots of examples that allow filtering by date range in addition to some key, but is there any way to efficiently query just by a date range, without such a key, and without using an Ordered Partitioner?
Upvotes: 2
Views: 481
Reputation: 4667
No. The partition key (the first element of the primary key) is what lets a query be routed to the appropriate node instead of scanning the whole cluster. However, if the partition key is always the same, the data won't be distributed across the cluster and a few nodes will take the whole workload. You could create a table like:
create table log (
log_type text,
day text, -- In format YYYY-MM-DD for instance
id timeuuid,
message text,
primary key ((log_type, day), id)
)
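Then, from your date range, you can determine the day values and therefore the possible partition keys. Add a condition on the timeuuid to finish. CQL does not accept a function applied to a column in the where clause, so instead of filtering on dateOf(id), compare id against maxTimeuuid/minTimeuuid of the range bounds:
select * from log where log_type='xxx' and day='2014-02-19' and id > maxTimeuuid(?) and id < minTimeuuid(?)
select * from log where log_type='xxx' and day='2014-02-20' and id > maxTimeuuid(?) and id < minTimeuuid(?)
select * from log where log_type='xxx' and day='2014-02-21' and id > maxTimeuuid(?) and id < minTimeuuid(?)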
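Deriving the day values from a date range can be done client-side. Here is a minimal Python sketch, assuming the day column uses the YYYY-MM-DD format from the table above; the helper name day_partitions is made up for illustration and is not part of any driver:

```python
from datetime import date, timedelta
from typing import List


def day_partitions(start: date, end: date) -> List[str]:
    """Return every day from start to end (inclusive) as 'YYYY-MM-DD' strings.

    Hypothetical helper: each returned string is one value for the `day`
    part of the partition key, so one query is issued per element.
    """
    if end < start:
        raise ValueError("end must not be before start")
    span = (end - start).days
    return [(start + timedelta(days=i)).isoformat() for i in range(span + 1)]


if __name__ == "__main__":
    for day in day_partitions(date(2014, 2, 19), date(2014, 2, 21)):
        print(day)
```

Each returned value is then bound to the day column of one query, together with the log_type and the two time bounds.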
Another option could be to use the ALLOW FILTERING clause, but this will do a full cluster scan. So it's a good idea only if you know that at least 90% of partition keys will contain interesting data.
select * from log where id > maxTimeuuid(?) and id < minTimeuuid(?) allow filtering
Upvotes: 2