Reputation: 1031
I have a table called Price in MySQL which looks like this:
+---------+-------------+------+-----+-------------------+-----------------------------+
| Field   | Type        | Null | Key | Default           | Extra                       |
+---------+-------------+------+-----+-------------------+-----------------------------+
| Current | float(20,3) | YES  |     | NULL              |                             |
| Time    | timestamp   | NO   |     | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
+---------+-------------+------+-----+-------------------+-----------------------------+
My application needs to sum and retrieve results over windows ranging from the last hour and the last 2 hours up to the last week, counted back from now. I am trying to move to Cassandra and want to design a suitable model for my data. Currently I have built a table in Cassandra which looks something like this:
CREATE TABLE IF NOT EXISTS HAS.Price (
ID INT,
Current float,
Time timestamp,
Time_uuid timeuuid,
PRIMARY KEY (ID, Time_uuid)
);
This is not logical, as it just creates one big table, and I don't think it will distribute data to other nodes. I am using a fixed ID of 1 here. I believe the logical partition key in my case would be "hour", so that I can, for example, sum all the Current values from the last hour, the last 2 hours and so on. Here I am referring to this post. If I use the hour as the partition key, all the data for, say, the 15th hour of the day will go into the row 2015-08-06 15:00:00, and the data for the next hour will go into 2015-08-06 16:00:00. However, suppose the current time is 2015-08-06 16:12:43 and I want to select the records from the last hour. What will my query look like, given that part of the data is in 2015-08-06 15:00:00, which has a different partition key?
Reputation: 938
"This is not logical as it just creates one big table and I don't think this will distribute data to other nodes."
Yes, this won't distribute data across your nodes.
Here is what I think the solution should be:
CREATE TABLE IF NOT EXISTS HAS.Price (
Time_uuid timeuuid,
Current float,
PRIMARY KEY (Time_uuid)
);
Then simply find the time_uuid for the start of the hour and the time_uuid for its end, and write a query like:
`SELECT * FROM HAS.Price WHERE time_uuid>=cdb36860-4444-11e5-8080-808080808080 AND time_uuid<=f784b8ef-450d-11e5-7f7f-7f7f7f7f7f7f`
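One way to find those two boundary values on the client is sketched below in Python. This is only a sketch under assumptions: the helper names min_timeuuid and max_timeuuid are made up for illustration, the fixed clock-sequence/node suffixes (8080-808080808080 and 7f7f-7f7f7f7f7f7f) are simply copied from the two values in the query above, and the timestamp is taken at millisecond precision.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Number of 100 ns ticks between the UUID epoch (1582-10-15) and the Unix epoch.
UUID_EPOCH_OFFSET = 0x01B21DD213814000
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def _unix_millis(dt):
    # Integer milliseconds since the Unix epoch; dt must be timezone-aware.
    return (dt - UNIX_EPOCH) // timedelta(milliseconds=1)


def _timeuuid(ticks, seq_hi, seq_low, node):
    # Pack a 60-bit tick count into the version 1 UUID time fields.
    time_low = ticks & 0xFFFFFFFF
    time_mid = (ticks >> 32) & 0xFFFF
    time_hi_version = ((ticks >> 48) & 0x0FFF) | 0x1000
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             seq_hi, seq_low, node))


def min_timeuuid(dt):
    # Hypothetical helper: smallest timeuuid for the millisecond containing dt.
    ticks = _unix_millis(dt) * 10_000 + UUID_EPOCH_OFFSET
    return _timeuuid(ticks, 0x80, 0x80, 0x808080808080)


def max_timeuuid(dt):
    # Hypothetical helper: largest timeuuid for the millisecond containing dt
    # (last 100 ns tick before the next millisecond starts).
    ticks = (_unix_millis(dt) + 1) * 10_000 + UUID_EPOCH_OFFSET - 1
    return _timeuuid(ticks, 0x7F, 0x7F, 0x7F7F7F7F7F7F)


if __name__ == "__main__":
    start = datetime(2015, 8, 16, 18, 0, 0, tzinfo=timezone.utc)
    end = start + timedelta(hours=1)
    print(min_timeuuid(start), max_timeuuid(end))
```

The two printed values can then be substituted into the WHERE clause as the lower and upper bound.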
Reputation: 339
Try the following option. (I have corrected the answer.)
Design for your queries. The possible queries I can see, other than up-to-the-minute ones, are:
Get sum for day
Get sum for hour
Get sum for last hour (any time on the hour)
CREATE TABLE mykeyspace.price (
day text,
hour text,
inserttime timeuuid,
current float,
PRIMARY KEY ((day, hour), inserttime)
) WITH CLUSTERING ORDER BY (inserttime DESC)
Make two inserts for every transaction, like below:
insert into price (day, hour, inserttime, current) VALUES ('20150813', '', now(), 2.00);
insert into price (day, hour, inserttime, current) VALUES ('', '2015081317', now(), 2.00);
Where
Select query to get the last hour at any minute, using minTimeuuid and maxTimeuuid:
select day, hour, dateOf(inserttime) from price where day = '' and hour IN ('2015081317', '2015081316') and inserttime > maxTimeuuid('2015-08-13 16:20:00-0500') and inserttime < minTimeuuid('2015-08-13 17:20:00-0500');
Note: A range query is not allowed on a partition key. The documentation says you can use the token function, but the results are not predictable.
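To work out which hour partitions a "last N hours" window touches, the application can compute the keys itself before building the query. Below is a minimal Python sketch of that step. The function names hour_buckets and last_hours_query are made up for illustration, and it assumes the hour rows were written with an empty day value and a 'YYYYMMDDHH' hour key, as in the inserts above.

```python
from datetime import datetime, timedelta, timezone


def hour_buckets(end, hours=1):
    # Hypothetical helper: 'YYYYMMDDHH' keys, newest first, for every hour
    # partition touched by the window [end - hours, end].
    start = end - timedelta(hours=hours)
    cursor = start.replace(minute=0, second=0, microsecond=0)
    keys = []
    while cursor <= end:
        keys.append(cursor.strftime("%Y%m%d%H"))
        cursor += timedelta(hours=1)
    return keys[::-1]


def last_hours_query(end, hours=1):
    # Hypothetical helper: builds the CQL text for the window ending at `end`.
    # `end` must be timezone-aware so that %z renders an offset such as -0500.
    start = end - timedelta(hours=hours)
    fmt = "%Y-%m-%d %H:%M:%S%z"
    keys = ", ".join("'{}'".format(k) for k in hour_buckets(end, hours))
    return (
        "select current from price where day = '' "
        "and hour IN ({}) ".format(keys)
        + "and inserttime > maxTimeuuid('{}') ".format(start.strftime(fmt))
        + "and inserttime < minTimeuuid('{}');".format(end.strftime(fmt))
    )


if __name__ == "__main__":
    tz = timezone(timedelta(hours=-5))
    now = datetime(2015, 8, 13, 17, 20, 0, tzinfo=tz)
    print(hour_buckets(now))
    print(last_hours_query(now))
```

For longer windows, such as the last week, the same loop yields one key per hour, so querying the day partitions for the full days in the middle of the window would keep the IN list short.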