Chanti
Chanti

Reputation: 575

Join two tables based on nearby timestamps

Table1 (1422 rows)

 sn1     |       dateee        | shift | linee
---------+---------------------+-------+-------
 8419404 | 2015-02-27 09:45:50 | D     |     2
 8419383 | 2015-02-27 09:46:10 | D     |     2
 8419410 | 2015-02-27 09:46:40 | D     |     2
 8419385 | 2015-02-27 09:50:40 | D     |     2
 8419412 | 2015-02-27 09:50:50 | D     |     2
 8419390 | 2015-02-27 09:51:30 | D     |     2
 8419414 | 2015-02-27 09:52:00 | D     |     2
 8419387 | 2015-02-27 09:52:20 | D     |     2
 8419416 | 2015-02-27 09:52:50 | D     |     2
 8419394 | 2015-02-27 09:57:10 | D     |     2

Table2 (824 rows)

id    | id2 |        timee
------+-----+---------------------
 1302 |     | 2015-02-27 09:46:11
 1303 |     | 2015-02-27 09:46:36
 1304 |     | 2015-02-27 09:50:37
 1305 |     | 2015-02-27 09:51:06
 1306 |     | 2015-02-27 09:51:31
 1307 |     | 2015-02-27 09:51:55
 1308 |     | 2015-02-27 09:52:20
 1309 |     | 2015-02-27 09:52:45
 1310 |     | 2015-02-27 09:57:05

I want to join these two tables (using left join) with nearby timestamps. table1 is first step and table2 is second step in production process.

In my desired table dateee(from table1) and timee(from table2) should be very nearby. I want to correlate sn1 and id based on the nearby timestamp.

Upvotes: 2

Views: 7285

Answers (2)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 656706

"Nearby" is rather fuzzy.
To join table2 where timee lies within 10 seconds of dateee in the future:

SELECT *
FROM   table1 t1
LEFT   JOIN table2 t2 ON t2.timee BETWEEN t1.dateee
                                      AND t1.dateee + interval '10 sec';
  • The LEFT JOIN keeps rows from table1 in the result, even without match in table2.

  • There can be multiple matches, so that each row from the base tables can be returned multiple times in various combinations.

Alternative

To join each rows in table1 to the row with with the next higher timestamp. Exactly one row in the result per row on table1:

SELECT *
FROM   table1 t1
LEFT   JOIN LATERAL (
   SELECT *
   FROM   table2 t2
   WHERE  t2.timee >= t1.dateee
   ORDER  BY t2.timee
   LIMIT  1
   ) ON TRUE;

An index on (timee) is essential for performance.

Upvotes: 8

mrkovec
mrkovec

Reputation: 376

maybe something like this will work

select * 
from 
    Table1 t1, 
    Table2 t2,
    (select 
        a.*, 
        row_number() over (partition by a.sn1 order by a.mt) rn
    from
        (select 
            q.sn1, 
            w.id id, 
            min(abs(q.dateee - w.timee)) mt
        from 
            Table1 q, 
            Table2 w 
        group by q.sn1, w.id
        ) a
    ) mysort
where 
mysort.rn = 1 and mysort.sn1 = t1.sn1 and mysort.id=t2.id

but beware large data sets

Upvotes: 0

Related Questions