Reputation: 24590
I am already able to get the last row of time-series table as:
SELECT * from myapp.locations WHERE organization_id=1 and user_id=15 and date='2017-2-22' ORDER BY unix_time DESC LIMIT 1;
That works fine, however, I am wondering about performance and overhead of executing ORDER BY
as rows are already sorted, I just use it to get the last row, is it an overhead in my case?
If I don't use ORDER BY
, I will always get the first row in the table, so, I though I might be able to use INSERT
in another way, ex: insert always in the beginning instead of end of table?
Any advice? shall I use ORDER BY
without worries about performance?
Upvotes: 1
Views: 492
Reputation: 5180
If your query pattern for that table is always ORDER BY unix_time DESC
then you are in a reverse order time-series scenario, and I can say that your model is inaccurate (not wrong).
There's no reason not to sort the records in reverse order by adding a WITH CLUSTERING ORDER BY unix_time DESC
in the table definition, and in my opinion the ORDER BY unix_time DESC
will perform at most on par with something explicitly meant for these use cases (well, I think it will perform worse).
Upvotes: 1
Reputation: 12840
Just define your clustering key order to DESC
Like the below schema :
CREATE TABLE locations (
organization_id int,
user_id int,
date text,
unix_time bigint,
lat double,
long double,
PRIMARY KEY ((organization_id, user_id, date), unix_time)
) WITH CLUSTERING ORDER BY (unix_time DESC);
So by default your data will sorted by unix_time desc, you don't need to specify in query
Now you can just use the below query to get the last row :
SELECT * from myapp.locations WHERE organization_id = 1 and user_id = 15 and date = '2017-2-22' LIMIT 1;
Upvotes: 1