Karthik S
Karthik S

Reputation: 857

Query columns based on datetime in Cassandra

We are trying to create/query information from a CF based on the following structure (e.g. a datetime, datetime, integer)

e.g.
03-22-2012 10.00, 03-22-2012 10.30 100
03-22-2012 10.30, 03-22-2012 11.00 50
03-22-2012 11.00, 03-22-2012 11.30 200

How do I model the above structure in Cassandra and perform the following queries via Hector.

select * from <CF> where datetime1 > 03-22-2012 10.00 and datetime2 < 03-22-2012 10.30
select * from <CF> where datetime1 > 03-22-2012 10.00 and datetime2 < 03-22-2012 11.00
select * from <CF> where datetime = 03-22-2012 (i.e. for the entire day)

Upvotes: 4

Views: 3377

Answers (2)

agentgonzo
agentgonzo

Reputation: 3643

Unfortunately there is no way to do this easily with just one column family in Cassandra. The problem is you are wanting cassandra to sort based on two different things: datetime1 and datetime2.

The obvious structure for this would be to have your Columns being Composite types of Composite(TimeUUID, TimeUUID, Integer). In this case, they will get sorted by datetime1, then datetime2, then integer.

But you will always get the ordering based on datetime1 and not on datetime2 (though if two entries have the same datetime1 then it will then order just those entries based on datetime2).

A possible workaround would be to have two column families with duplicate data (or indeed two rows for each logical row). One row where data is inserted (datetime1:datetime2:integer) and the other where it is inserted (datetime2:datetime1:integer). You can then do a multigetslice operation on these two rows and combine the data before handing it off to the caller:

final MultigetSliceQuery<String, Composite, String> query = HFactory.createMultigetSliceQuery(keyspace,
    StringSerializer.get(),
    CompositeSerializer.get(),
    StringSerializer.get());

query.setColumnFamily("myColumnFamily");
startQuery.setKeys("myRow.arrangedByDateTime1", "myRow.arrangedByDateTime2");
startQuery.setRange(new Composite(startTime), new Composite(endTime), false, Integer.MAX_VALUE);

final QueryResult<Rows<String,Composite,String>> queryResult = query.execute();
final Rows<String,Composite,String> rows = queryResult.get();

Upvotes: 0

Tyler Hobbs
Tyler Hobbs

Reputation: 6932

This is a great introduction to working with dates and times in Cassandra: Basic Time Series with Cassandra.

In short, use timestamps (or v1 UUIDs) as your column names and set the comparator to LongType (or TimeUUIDType) in order to get chronological sorting of the columns. It's then easy to get a slice of data between two points in time.

Your question isn't totally clear about this, but if you want to get all events that happened during a given range of time of day regardless of the date, then you will want to structure your data differently. In this case, column names may be CompositeType(LongType, AsciiType), where the first component is a normal timestamp mod 86400 (the number of seconds in a day), and the second component is the date or something else that changes over time, like a full timestamp. You would also want to break up the row in this case, perhaps dedicating a different row to each hour.

Upvotes: 2

Related Questions