spcial
spcial

Reputation: 1579

SQL Join on timestamps where you only take the first/oldest result

Let's say I have two tables:

Table #1 t1

ID | Col1 | MeasureTime         | Parameter_ID
----------------------------------------------
1  | abc  | 2020-11-11 07:00:00 | 1
2  | abc  | 2020-11-11 08:00:00 | 1

Table #2 t2

ID | Parameter_ID | Col2 | ValidFrom
----------------------------------------------
1  | 1            | xyz  | 2020-11-11 06:30:00
2  | 1            | def  | 2020-11-11 07:30:00

I want to join both tables using the Parameter_ID and I want to have a 1:1 relationship. The ValidFrom from t2 is the time that this parameter becomes valid. But unfortunately there is no ValidTo but rather I have to check that I join with the first Parameter_ID from t2 where the ValidFrom is valid.

So I want to have this result:

Table #3 t3

ID | Col1 | MeasureTime         | Parameter_ID | ValidFrom | Col2
-----------------------------------------------------------------
1  | abc  | 2020-11-11 07:00:00 | 1            | 06:30:00  | xyz
2  | abc  | 2020-11-11 08:00:00 | 1            | 07:30:00  | def

but if I would do a simple:

SELECT * 
FROM t1 
JOIN t2 ON t1.PARAMETER_ID = t2.PARAMETER_ID AND t1.Measuretime >= t2.ValidFrom

I would get:

Table #3 t3

ID | Col1 | MeasureTime         | Parameter_ID | ValidFrom | Col2
-----------------------------------------------------------------
1  | abc  | 2020-11-11 07:00:00 | 1            | 06:30:00  | xyz
2  | abc  | 2020-11-11 08:00:00 | 1            | 06:30:00  | xyz
3  | abc  | 2020-11-11 08:00:00 | 1            | 07:30:00  | def

where I actually don't want to have ID 2.

Hope my problem becomes clear. I could think of something like "if there are multiple results from t2, order by ValidFrom ASC and take only the first". But unfortunately I'm not really sure how to do this with SQL. Maybe there is also a more elegant solution to this issue?

Upvotes: 1

Views: 210

Answers (3)

Gordon Linoff
Gordon Linoff

Reputation: 1270613

I would suggest generated a valid_to using lead(). Then use a join:

select t1.*, t2.valid_from, t2.col2
from table1 t1 join
     (select t2.*,
             lead(valid_from) over (partition by parameter_id order by valid_from) as valid_to
      from t2
     ) t2
     on t1.parameter_id = t2.parameter_id and
        t1.MeasureTime >= t2.valid_from and
        (t1.MeasureTime < t2.valid_to or t2.valid_to is null);

Although you way that you are willing to just line up the values in the two tables, it sounds like the correct solution is to get the row from table2 whose date range includes the table1 date.

Upvotes: 1

Popeye
Popeye

Reputation: 35920

You can expand your original query using ROW_NUMBER analytical function as follows:

SELECT * FROM
(SELECT T1.*, T2.VALID_FROM, T2.COL2,
        ROW_NUMBER() 
          OVER (PARTITION BY T1.ID ORDER BY T2.VALID_FROM DESC NULLS LAST) AS RN
  FROM T1 JOIN T2 ON T1.PARAMETER_ID = T2.PARAMETER_ID
                 AND T1.MEASURETIME >= T2.VALIDFROM)
WHERE RN = 1

Upvotes: 1

Zaynul Abadin Tuhin
Zaynul Abadin Tuhin

Reputation: 31991

you can use row_number()

select a.* from 

(SELECT *, row_number()over(partition by Col1,MeasureTime order by MeasureTime) rn
FROM t1 
JOIN t2 ON t1.PARAMETER_ID = t2.PARAMETER_ID AND t1.Measuretime >= t2.ValidFrom
) a were a.rn=1

Upvotes: 1

Related Questions