konewka
konewka

Reputation: 650

SQL join category from timeline into timestamp table

Consider the following structure:

create table timestamps(id int, stamp timestamp);
insert into timestamps values
(1,'2017-10-01 10:05:01'),
(2,'2017-10-01 11:05:01'),
(3,'2017-10-01 12:05:01'),
(4,'2017-10-01 13:05:01');

create table category_timeline(begin timestamp,end timestamp, category varchar(100));
insert into category_timeline values
('2017-10-01 10:01:03','2017-10-01 12:01:03','Cat1'),
('2017-10-01 12:01:03','2017-10-01 12:42:43','Cat3'),
('2017-10-01 12:42:43','2017-10-01 14:01:03','Cat2');

Sqlfiddle of same: SQL Fiddle

I have two tables, one (timestamps) containing timestamps, and one (category_timeline) containing a timeline of categories, that is, we assume the records in category_timeline form a continuous non-overlapping timeline assigning a category to each time period.

I want to assign the categories to the timestamps table, resulting in:

| id |                stamp | category |
|----|----------------------|----------|
|  1 | 2017-10-01T10:05:01Z |     Cat1 |
|  2 | 2017-10-01T11:05:01Z |     Cat1 |
|  3 | 2017-10-01T12:05:01Z |     Cat3 |
|  4 | 2017-10-01T13:05:01Z |     Cat2 |

which is the result of the following query:

SELECT id, stamp, category FROM timestamps ts 
LEFT JOIN category_timeline tl 
ON   ts.stamp >= tl.begin 
AND  ts.stamp < tl.end

However, as soon as the tables get bigger, this operation seems to get exponentially slower, is there a better way to do this, using the assumption that any timestamp only falls within a unique period in the other table.

Upvotes: 0

Views: 92

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270401

I would suggest this approach:

SELECT ts.id, ts.stamp, 
       (SELECT tl.category
        FROM category_timeline tl 
        WHERE tl.end > ts.stamp
        ORDER BY tl.end ASC
        LIMIT 1
       ) as category
FROM timestamps ts ;

Be sure you have an index on category_timeline(end, category).

Upvotes: 1

Related Questions