Reputation: 650
Consider the following structure:
create table timestamps(id int, stamp timestamp);
insert into timestamps values
(1,'2017-10-01 10:05:01'),
(2,'2017-10-01 11:05:01'),
(3,'2017-10-01 12:05:01'),
(4,'2017-10-01 13:05:01');
create table category_timeline(begin timestamp,end timestamp, category varchar(100));
insert into category_timeline values
('2017-10-01 10:01:03','2017-10-01 12:01:03','Cat1'),
('2017-10-01 12:01:03','2017-10-01 12:42:43','Cat3'),
('2017-10-01 12:42:43','2017-10-01 14:01:03','Cat2');
Sqlfiddle of same: SQL Fiddle
I have two tables, one (timestamps
) containing timestamps, and one (category_timeline
) containing a timeline of categories, that is, we assume the records in category_timeline
form a continuous non-overlapping timeline assigning a category to each time period.
I want to assign the categories to the timestamps
table, resulting in:
| id | stamp | category |
|----|----------------------|----------|
| 1 | 2017-10-01T10:05:01Z | Cat1 |
| 2 | 2017-10-01T11:05:01Z | Cat1 |
| 3 | 2017-10-01T12:05:01Z | Cat3 |
| 4 | 2017-10-01T13:05:01Z | Cat2 |
which is the result of the following query:
SELECT id, stamp, category FROM timestamps ts
LEFT JOIN category_timeline tl
ON ts.stamp >= tl.begin
AND ts.stamp < tl.end
However, as soon as the tables get bigger, this operation seems to get exponentially slower, is there a better way to do this, using the assumption that any timestamp only falls within a unique period in the other table.
Upvotes: 0
Views: 92
Reputation: 1270401
I would suggest this approach:
SELECT ts.id, ts.stamp,
(SELECT tl.category
FROM category_timeline tl
WHERE tl.end > ts.stamp
ORDER BY tl.end ASC
LIMIT 1
) as category
FROM timestamps ts ;
Be sure you have an index on category_timeline(end, category)
.
Upvotes: 1