Reputation: 1489
given the following table:
create table #T
(
user_id int,
project_id int,
datum datetime,
status varchar(10),
KM int
)
insert into #T values
(1, 1, '20160301 10:25', 'START', 1000),
(1, 1, '20160301 10:28', 'PASS', 1008),
(2, 2, '20160301 10:29', 'START', 2000),
(1, 1, '20160301 11:08', 'STOP', 1045),
(3, 3, '20160301 10:25', 'START', 3000),
(2, 2, '20160301 10:56', 'STOP', 2020),
(1, 4, '20160301 15:00', 'START', 1045),
(4, 5, '20160301 15:10', 'START', 400),
(1, 4, '20160301 15:10', 'PASS', 1060),
(1, 4, '20160301 15:20', 'PASS', 1080),
(1, 4, '20160301 15:30', 'STOP', 1080),
(4, 5, '20160301 15:40', 'STOP', 450),
(3, 3, '20160301 16:25', 'STOP', 3200)
I have to sum the length of a track between START and STOP statuses for a given user and project The expected result would be this:
user_id project_id datum TOTAL_KM
----------- ----------- ---------- -----------
1 1 2016-03-01 45
1 4 2016-03-01 35
2 2 2016-03-01 20
3 3 2016-03-01 200
4 5 2016-03-01 50
How can I achieve this without using a cluster? The performance is an issue (I have over 1 million records per month and we have to keep data for several years)
Explanation: We can ignore the records with the status "PASS". Basically we have to subtract the KM value of the START record from the STOP record for a given user and project. There can be several hundred records between a START and STOP (like describes in the sample data) The date should be the date of START (in case where we have an over midnight delivery)
I think I should have a SELECT with an OVER() clause but I don't know how to formulate my query to respect those conditions.
Any idea?
Upvotes: 1
Views: 165
Reputation: 15977
SELECT t.[user_id],
t.project_id,
cast(t.datum as date) as datum,
t1.KM- t.KM as KM
FROM #T t
INNER JOIN #T t1
ON t.[user_id]=t1.[user_id] and t.project_id = t1.project_id
WHERE t.[status] = 'START' and t1.[status] = 'STOP'
ORDER BY t.[user_id],
t.project_id,
cast(t.datum as date)
Output:
user_id project_id datum KM
----------- ----------- ---------- -----------
1 1 2016-03-01 45
1 4 2016-03-01 35
2 2 2016-03-01 20
3 3 2016-03-01 200
4 5 2016-03-01 50
(5 row(s) affected)
Upvotes: 2
Reputation: 177
This could be achieved by simple self join.
One of the example: (this may not be exact query but just an idea)
Select
a.user_id,
a.project_id,
b.datum as StartDate,
a.KM-b.KM as TotalKM
From #T a
Where status = 'STOP'
Join
(
Select user_id, project_id, KM From #t Where
status = 'START'
) b ON b.user_id = a.user_id, b.project_id = a.project_id
#T b
Upvotes: 0