user10679526
user10679526

Reputation: 109

PostgreSQL: Joining based on timestamp

I'm having two databases step1 and step2 containing a serial_number, an timestamp as well as some values. Each serial number is present in step1 and step2 several times, however step2 is optional. So for example, serial number "1" can be present in order of time like that: step1-step2-step1-step1-step2

For evaluation of the values, I want to join the two tables: However, I want to make sure, that just the directly following step2-row is joined to a step1-row, or in other words: no other step1 row with this serial_number is in between.

with step1 as (
    select * from (
        values 
        (1,'2020-01-01 12:00:00'::timestamp,45),
        (2,'2020-01-01 12:01:00'::timestamp,12),
        (1,'2020-01-01 13:00:00'::timestamp,68),
        (2,'2020-01-01 13:01:00'::timestamp,98),
        (3,'2020-01-01 13:02:00'::timestamp,2),
        (1,'2020-01-01 14:00:00'::timestamp,23)
    ) dev1(serial_number, ts, value1)
),
step2 as (
    select * from (
        values 
        (1,'2020-01-01 12:02:00'::timestamp,47),
        (2,'2020-01-01 12:03:00'::timestamp,16),
        (2,'2020-01-01 13:04:00'::timestamp,95),
        (3,'2020-01-01 13:05:00'::timestamp,4),
        (1,'2020-01-01 14:01:00'::timestamp,62)
    ) dev1(serial_number, ts, value2)
)
select * from step1
left join step2 on step1.serial_number=step2.serial_number

Expected result:

enter image description here

Any idea, how I can achieve this result?

Upvotes: 1

Views: 46

Answers (1)

jjanes
jjanes

Reputation: 44167

It looks like you want to use the LEAD window function to find the next time stamp in step1 within each serial number, then use a lateral join to bring in step2 where the step2.ts is between the original ts and the LEAD ts (or the LEAD is NULL, meaning there is no following ts).

with step1 as (
    select * from (
        values 
        (1,'2020-01-01 12:00:00'::timestamp,45),
        (2,'2020-01-01 12:01:00'::timestamp,12),
        (1,'2020-01-01 13:00:00'::timestamp,68),
        (2,'2020-01-01 13:01:00'::timestamp,98),
        (3,'2020-01-01 13:02:00'::timestamp,2),
        (1,'2020-01-01 14:00:00'::timestamp,23)
    ) dev1(serial_number, ts, value1)
),
step2 as (
    select * from (
        values 
        (1,'2020-01-01 12:02:00'::timestamp,47),
        (2,'2020-01-01 12:03:00'::timestamp,16),
        (2,'2020-01-01 13:04:00'::timestamp,95),
        (3,'2020-01-01 13:05:00'::timestamp,4),
        (1,'2020-01-01 14:01:00'::timestamp,62)
    ) dev1(serial_number, ts, value2)
),
step1_1 as (
    select *, lead(ts)  over (partition by serial_number order by ts) as lead_ts from step1
) 
select * from step1_1 
   left join lateral (
     select * from step2 where step1_1.serial_number=step2.serial_number and (step2.ts < lead_ts or lead_ts is null) and step2.ts > step1_1.ts order by ts limit 1
   ) foobar on true;

You could fold the definition of step1_1 into the definition of step1 itself, but for explanatory purposes I've left them separate.

Upvotes: 1

Related Questions