Reputation: 3081
I have the following table, let's call it Names
:
Name Id Date
Dirk 1 27-01-2015
Jan 2 31-01-2015
Thomas 3 21-02-2015
Next I have the another table called Consumption
:
Id Date Consumption
1 26-01-2015 30
1 01-01-2015 20
2 01-01-2015 10
2 05-05-2015 20
Now the problem is, that I think that doing this using SQL is the fastest, since the table contains about 1.5 million rows.
So the problem is as follows, I would like to match each Id from the Names
table with the Consumption
table provided that the difference between the dates
are the lowest, so we have: Dirk
consumes on 27-01-2015
about 30
. In case there are two dates that have the same "difference", I would like to calculate the average consumption on those two dates.
While I know how to join, I do not know how to code the difference part.
Thanks.
DBMS is Microsoft SQL Server 2012.
I believe that my question differs from the one mentioned in the comments, because it is much more complicated since it involves comparison of dates between two tables rather than having one date and comparing it with the rest of the dates in the table.
Upvotes: 1
Views: 288
Reputation: 72175
This is how you could it in SQL Server:
SELECT Id, Name, AVG(Consumption)
FROM (
SELECT n.Id, Name, Consumption,
RANK() OVER (PARTITION BY n.Id
ORDER BY ABS(DATEDIFF(d, n.[Date], c.[Date]))) AS rnk
FROM Names AS n
INNER JOIN Consumption AS c ON n.Id = c.Id ) t
WHERE t.rnk = 1
GROUP BY Id, Name
Using RANK
with PARTITION BY n.Id
and ORDER BY ABS(DATEDIFF(d, n.[Date], c.[Date]))
you can locate all matching records per Id
: all records with the smallest difference in days are going to have rnk = 1
.
Then, using AVG
in the outer query, you are calculating the average value of Consumption
between all matching records.
Upvotes: 4