Snowflake
Snowflake

Reputation: 3081

SQL statement to match dates that are the closest?

I have the following table, let's call it Names:

Name     Id    Date
Dirk     1     27-01-2015
Jan      2     31-01-2015
Thomas   3     21-02-2015

Next I have the another table called Consumption:

Id Date       Consumption
1  26-01-2015 30
1  01-01-2015 20
2  01-01-2015 10
2  05-05-2015 20

Now the problem is, that I think that doing this using SQL is the fastest, since the table contains about 1.5 million rows.

So the problem is as follows, I would like to match each Id from the Names table with the Consumption table provided that the difference between the dates are the lowest, so we have: Dirk consumes on 27-01-2015 about 30. In case there are two dates that have the same "difference", I would like to calculate the average consumption on those two dates.

While I know how to join, I do not know how to code the difference part.

Thanks.

DBMS is Microsoft SQL Server 2012.

I believe that my question differs from the one mentioned in the comments, because it is much more complicated since it involves comparison of dates between two tables rather than having one date and comparing it with the rest of the dates in the table.

Upvotes: 1

Views: 288

Answers (1)

Giorgos Betsos
Giorgos Betsos

Reputation: 72175

This is how you could it in SQL Server:

SELECT Id, Name, AVG(Consumption)
FROM (
    SELECT n.Id, Name, Consumption, 
           RANK() OVER (PARTITION BY n.Id 
                        ORDER BY ABS(DATEDIFF(d, n.[Date], c.[Date]))) AS rnk
    FROM Names AS n
    INNER JOIN Consumption AS c ON n.Id = c.Id ) t
WHERE t.rnk = 1
GROUP BY Id, Name

Using RANK with PARTITION BY n.Id and ORDER BY ABS(DATEDIFF(d, n.[Date], c.[Date])) you can locate all matching records per Id: all records with the smallest difference in days are going to have rnk = 1.

Then, using AVG in the outer query, you are calculating the average value of Consumption between all matching records.

SQL Fiddle Demo

Upvotes: 4

Related Questions