Lukasz Kujawa

Reputation: 3096

How to retrieve a date range from cassandra

I have a very simple table that stores a collection of IDs by date range:

CREATE TABLE schedule_range (
  start_date timestamp,
  end_date timestamp,
  schedules set<text>,
  PRIMARY KEY ((start_date, end_date)));

I was hoping to be able to query it by a date range

SELECT *
FROM schedule_range
WHERE start_date >= 'xxx'
AND end_date < 'yyy'

Unfortunately, it doesn't work this way. I've tried a few different approaches, and each one fails for a different reason.

How should I store IDs to be able to get them all by a date range?

Upvotes: 1

Views: 4126

Answers (2)

gasparms

Reputation: 3354

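In Cassandra, the range operators (>, <) can only be used on a clustering column, and every primary key column before it must be restricted with an equality. In your schema, start_date and end_date together form the composite partition key, and partition key columns accept only = or IN, so neither one can be queried by range. If you are set on that schema, you could consider other options.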

One option is to use Apache Spark. Several projects build an abstraction layer in Spark over Cassandra, which lets you run operations on Cassandra data such as joins, arbitrary filters, and group-bys.

Check these projects:

Upvotes: 2

catpaws

Reputation: 2283

The table below works with a query that somewhat resembles yours, for two reasons: 1) The query puts no range condition on the partition key, start_date; only EQ and IN relations are supported on the partition key. 2) Greater-than and less-than comparisons on a clustering column are restricted to filters that select a contiguous ordering of rows. Restricting id, the first clustering column (the second component of the compound key), with an equality satisfies that requirement.

create table schedule_range2(start_date timestamp, end_date timestamp, id int, schedules set<text>, primary key (start_date, id, end_date));
insert into schedule_range2 (start_date, id, end_date, schedules) VALUES ('2014-02-03 04:05', 1, '2014-02-04 04:00', {'event1', 'event2'});
insert into schedule_range2 (start_date, id, end_date, schedules) VALUES ('2014-02-05 04:05', 1, '2014-02-06 04:00', {'event3', 'event4'});
select * from schedule_range2 where id=1 and end_date >='2014-02-04 04:00' and end_date < '2014-02-06 04:00' ALLOW FILTERING;
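
Note that the query above does not restrict the partition key, which is why it needs ALLOW FILTERING and has to scan every partition. A common alternative, swapped in here and not taken from either answer, is to bucket rows by calendar day so that each query hits one partition with an equality and uses the range only on a clustering column. The sketch below is a minimal illustration under stated assumptions: the table `schedule_by_day`, its `day` column, and the helper names are hypothetical, and the `%s` placeholders assume a client driver that accepts that style.

```python
from datetime import datetime, timedelta

# Hypothetical table (not from the original post): one partition per calendar
# day, with start_date as the first clustering column so it accepts ranges.
CREATE_TABLE = """
CREATE TABLE schedule_by_day (
  day text,
  start_date timestamp,
  end_date timestamp,
  schedules set<text>,
  PRIMARY KEY ((day), start_date, end_date)
)
"""

# Equality on the partition key, range on the first clustering column.
SELECT_RANGE = (
    "SELECT * FROM schedule_by_day "
    "WHERE day = %s AND start_date >= %s AND start_date < %s"
)


def day_buckets(start, end):
    """Return ISO day strings for every day touched by the range [start, end)."""
    if end <= start:
        return []
    days = []
    current = start.date()
    # The last instant inside the half-open range decides the final bucket.
    last = (end - timedelta(microseconds=1)).date()
    while current <= last:
        days.append(current.isoformat())
        current += timedelta(days=1)
    return days


def range_queries(start, end):
    """Build (statement, parameters) pairs, one per day bucket."""
    return [(SELECT_RANGE, (day, start, end)) for day in day_buckets(start, end)]


if __name__ == "__main__":
    queries = range_queries(datetime(2014, 2, 3, 4, 5), datetime(2014, 2, 6, 4, 0))
    for statement, params in queries:
        print(statement, params)
```

The client runs one query per bucket and merges the results. This assumes each row is written under the day of its start_date; rows that should be found by end_date as well would need a second table or a different bucket rule.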

Upvotes: 1
