Reputation: 338
I have the following data in a database:
Job ClockInDateTime ClockOutDateTime
MM00151509 2013-11-19 07:01 2013-11-19 09:20
MM00151800 2013-11-09 09:08 2013-11-20 11:36
MM00153591 2013-12-01 08:20 2013-12-03 08:15
MM00154121 2013-12-05 08:19 2013-12-05 10:32
I would like to take each row and show the the date along with the associated hours for that date. As you can see in the above result, many of the records span days between ClockInDateTime and ClockOutDateTime. The are thousands of Jobs(rows).
What would be the sql query to do this?
Example output for one Job above - MM00153591:
Job ClockInDateTime ClockOutDateTime Date Hours
MM00153591 2013-12-01 08:20 2013-12-03 08:26 12/1/2013 15.3333
MM00153591 2013-12-01 08:20 2013-12-03 08:26 12/2/2013 24
MM00153591 2013-12-01 08:20 2013-12-03 08:15 12/3/2013 8.25
Upvotes: 1
Views: 97
Reputation: 601
The trick here is that you want to want to get a row explosion resulting in 1 row for each day between the specified dates.
This can be achieved in a few ways, but one way is to join on a simple date table that has all possible dates in it (within a reasonable range), where the join criteria is on a range. Depending on your performance needs, this can be built up on the fly or built up in advance and just stored for later use.
Here's an example, assuming your source 3 columns are in a table called Job:
DECLARE @MinDate DATETIME
DECLARE @TotalDays INT
SELECT @MinDate = DATEADD(dd,-1, CONVERT(DATE,MIN(ClockInDateTime))),
@TotalDays = DATEDIFF(dd, CONVERT(DATE,MIN(ClockInDateTime)), CONVERT(DATE,MAX(ClockOutDateTime))) + 2
FROM Job
SELECT J.Job, J.ClockInDateTime, J.ClockOutDateTime,
CONVERT(DATE, Dt) as [Date],
CASE WHEN ClockInDateTime < Dt AND ClockOutDateTime >= DATEADD(dd, 1, Dt) THEN 24*60
WHEN ClockInDateTime >= Dt AND ClockOutDateTime >= DATEADD(dd, 1, Dt)
THEN DATEDIFF(mi, ClockInDateTime, DATEADD(dd, 1, Dt))
WHEN ClockInDateTime < Dt AND ClockOutDateTime < DATEADD(dd, 1, Dt)
THEN DATEDIFF(mi, Dt, ClockOutDateTime)
ELSE DATEDIFF(mi, ClockInDateTime, ClockOutDateTime)
END / 60.0 as Hours
FROM Job J
INNER JOIN (SELECT TOP (@TotalDays) DATEADD(dd, ROW_NUMBER() OVER (ORDER BY s1.object_id), @MinDate) as Dt
FROM sys.objects s1
CROSS JOIN sys.objects s2) as DateTable
ON Dt BETWEEN CONVERT(DATE,J.ClockInDateTime) AND CONVERT(DATE,J.ClockOutDateTime)
ORDER BY Job, [Date]
Some simple testing with your sample data yields what appear to be correct results, though for the job you called out I did get slightly different answers for the first day's total hours:
Job ClockInDateTime ClockOutDateTime Date Hours
MM00153591 2013-12-01 08:20 2013-12-03 08:26 12/1/2013 15.6666
MM00153591 2013-12-01 08:20 2013-12-03 08:26 12/2/2013 24
MM00153591 2013-12-01 08:20 2013-12-03 08:15 12/3/2013 8.25
Note: the method I'm using above to build the date table is a simple way to generate a "numbers table", but there are other, perhaps more reliable approaches as well: What is the best way to create and populate a numbers table?
EDIT: Here's the full script against the provided sample data, along with an alternate method that doesn't use a date table and instead loops for each day across the set. The full results of each method are provided unaltered also:
SET NOCOUNT ON
IF OBJECT_ID('tempdb..#Table') IS NOT NULL DROP TABLE #Table
CREATE TABLE #Table (Job NVARCHAR(256), ClockInDateTime DATETIME, ClockOutDateTime DATETIME)
INSERT INTO #Table (Job, ClockInDateTime, ClockOutDateTime)
SELECT N'MM00151509', '2013-11-19 07:01', '2013-11-19 09:20'
UNION ALL SELECT N'MM00151800','2013-11-09 09:08','2013-11-20 11:36'
UNION ALL SELECT N'MM00153591','2013-12-01 08:20','2013-12-03 08:15'
UNION ALL SELECT N'MM00154121','2013-12-05 08:19','2013-12-05 10:32'
PRINT 'Method 1: Calculate as a subset of all possible days'
DECLARE @MinDate DATETIME
DECLARE @TotalDays INT
SELECT @MinDate = DATEADD(dd,-1, CONVERT(DATE,MIN(ClockInDateTime))),
@TotalDays = DATEDIFF(dd, CONVERT(DATE,MIN(ClockInDateTime)), CONVERT(DATE,MAX(ClockOutDateTime))) + 2
FROM #Table
SELECT J.Job, J.ClockInDateTime, J.ClockOutDateTime,
CONVERT(DATE, Dt) as [Date],
CASE WHEN ClockInDateTime < Dt AND ClockOutDateTime >= DATEADD(dd, 1, Dt) THEN 24*60
WHEN ClockInDateTime >= Dt AND ClockOutDateTime >= DATEADD(dd, 1, Dt)
THEN DATEDIFF(mi, ClockInDateTime, DATEADD(dd, 1, Dt))
WHEN ClockInDateTime < Dt AND ClockOutDateTime < DATEADD(dd, 1, Dt)
THEN DATEDIFF(mi, Dt, ClockOutDateTime)
ELSE DATEDIFF(mi, ClockInDateTime, ClockOutDateTime)
END / 60.0 as Hours
FROM #Table J
INNER JOIN (SELECT TOP (@TotalDays) DATEADD(dd, ROW_NUMBER() OVER (ORDER BY s1.object_id), @MinDate) as Dt
FROM sys.objects s1
CROSS JOIN sys.objects s2) as DateTable
ON Dt BETWEEN CONVERT(DATE,J.ClockInDateTime) AND CONVERT(DATE,J.ClockOutDateTime)
ORDER BY Job, [Date]
PRINT 'Method 2: Loop 1 day at a time'
GO
IF OBJECT_ID('dbo.udf_MinDate') IS NOT NULL DROP FUNCTION dbo.udf_MinDate
GO
IF OBJECT_ID('dbo.udf_MaxDate') IS NOT NULL DROP FUNCTION dbo.udf_MaxDate
GO
CREATE FUNCTION dbo.udf_MinDate( @Date1 DATETIME, @Date2 DATETIME)
RETURNS DATETIME
AS
BEGIN
RETURN CASE WHEN @Date1 < @Date2 THEN @Date1 ELSE @Date2 END
END
GO
CREATE FUNCTION dbo.udf_MaxDate( @Date1 DATETIME, @Date2 DATETIME)
RETURNS DATETIME
AS
BEGIN
RETURN CASE WHEN @Date1 > @Date2 THEN @Date1 ELSE @Date2 END
END
GO
IF OBJECT_ID('tempdb..#TempResult') IS NOT NULL
DROP TABLE #TempResult
CREATE TABLE #TempResult ( Job NVARCHAR(256), ClockInDateTime DATETIME, ClockOutDateTime DATETIME, [Date] DATE, [Hours] DECIMAL(18,6))
DECLARE @MaxDaysDifferent INT, @CurDayOffset INT
SELECT @MaxDaysDifferent = MAX(DATEDIFF(dd, ClockInDateTime, ClockOutDateTime))
FROM #Table
SET @CurDayoffset = 0
WHILE (@CurDayOffset <= @MaxDaysDifferent)
BEGIN
INSERT INTO #TempResult (Job, ClockInDateTime, ClockOutDateTime, [Date], Hours)
SELECT T.Job, T.ClockInDateTime, T.ClockOutDateTime,
DATEADD(dd, @CurDayOffset, CONVERT(DATE,T.ClockInDateTime)),
DATEDIFF(mi, dbo.udf_MaxDate(T.ClockInDateTime, DATEADD(dd, @CurDayOffset, CONVERT(DATE,T.ClockInDateTime))),
dbo.udf_MinDate(T.ClockOutDateTime, DATEADD(dd, @CurDayOffset + 1, CONVERT(DATE,T.ClockInDateTime)))) / 60.0 as [Hours]
FROM #Table T
WHERE DATEADD(dd, @CurDayOffset, CONVERT(DATE,T.ClockInDateTime)) <= T.ClockOutDateTime
SET @CurDayOffset = @CurDayOffset + 1
END
SELECT * FROM #TempResult
ORDER BY Job, [Date]
The full results of this query are:
Method 1: Calculate as a subset of all possible days
Job ClockInDateTime ClockOutDateTime Date Hours
------------------------------ ----------------------- ----------------------- ---------- ------------------------------
MM00151509 2013-11-19 07:01:00.000 2013-11-19 09:20:00.000 2013-11-19 2.316666
MM00151800 2013-11-09 09:08:00.000 2013-11-20 11:36:00.000 2013-11-09 14.866666
MM00151800 2013-11-09 09:08:00.000 2013-11-20 11:36:00.000 2013-11-10 24.000000
MM00151800 2013-11-09 09:08:00.000 2013-11-20 11:36:00.000 2013-11-11 24.000000
MM00151800 2013-11-09 09:08:00.000 2013-11-20 11:36:00.000 2013-11-12 24.000000
MM00151800 2013-11-09 09:08:00.000 2013-11-20 11:36:00.000 2013-11-13 24.000000
MM00151800 2013-11-09 09:08:00.000 2013-11-20 11:36:00.000 2013-11-14 24.000000
MM00151800 2013-11-09 09:08:00.000 2013-11-20 11:36:00.000 2013-11-15 24.000000
MM00151800 2013-11-09 09:08:00.000 2013-11-20 11:36:00.000 2013-11-16 24.000000
MM00151800 2013-11-09 09:08:00.000 2013-11-20 11:36:00.000 2013-11-17 24.000000
MM00151800 2013-11-09 09:08:00.000 2013-11-20 11:36:00.000 2013-11-18 24.000000
MM00151800 2013-11-09 09:08:00.000 2013-11-20 11:36:00.000 2013-11-19 24.000000
MM00151800 2013-11-09 09:08:00.000 2013-11-20 11:36:00.000 2013-11-20 11.600000
MM00153591 2013-12-01 08:20:00.000 2013-12-03 08:15:00.000 2013-12-01 15.666666
MM00153591 2013-12-01 08:20:00.000 2013-12-03 08:15:00.000 2013-12-02 24.000000
MM00153591 2013-12-01 08:20:00.000 2013-12-03 08:15:00.000 2013-12-03 8.250000
MM00154121 2013-12-05 08:19:00.000 2013-12-05 10:32:00.000 2013-12-05 2.216666
Method 2: Loop 1 day at a time
Job ClockInDateTime ClockOutDateTime Date Hours
------------------------------ ----------------------- ----------------------- ---------- ------------------------------
MM00151509 2013-11-19 07:01:00.000 2013-11-19 09:20:00.000 2013-11-19 2.316666
MM00151800 2013-11-09 09:08:00.000 2013-11-20 11:36:00.000 2013-11-09 14.866666
MM00151800 2013-11-09 09:08:00.000 2013-11-20 11:36:00.000 2013-11-10 24.000000
MM00151800 2013-11-09 09:08:00.000 2013-11-20 11:36:00.000 2013-11-11 24.000000
MM00151800 2013-11-09 09:08:00.000 2013-11-20 11:36:00.000 2013-11-12 24.000000
MM00151800 2013-11-09 09:08:00.000 2013-11-20 11:36:00.000 2013-11-13 24.000000
MM00151800 2013-11-09 09:08:00.000 2013-11-20 11:36:00.000 2013-11-14 24.000000
MM00151800 2013-11-09 09:08:00.000 2013-11-20 11:36:00.000 2013-11-15 24.000000
MM00151800 2013-11-09 09:08:00.000 2013-11-20 11:36:00.000 2013-11-16 24.000000
MM00151800 2013-11-09 09:08:00.000 2013-11-20 11:36:00.000 2013-11-17 24.000000
MM00151800 2013-11-09 09:08:00.000 2013-11-20 11:36:00.000 2013-11-18 24.000000
MM00151800 2013-11-09 09:08:00.000 2013-11-20 11:36:00.000 2013-11-19 24.000000
MM00151800 2013-11-09 09:08:00.000 2013-11-20 11:36:00.000 2013-11-20 11.600000
MM00153591 2013-12-01 08:20:00.000 2013-12-03 08:15:00.000 2013-12-01 15.666666
MM00153591 2013-12-01 08:20:00.000 2013-12-03 08:15:00.000 2013-12-02 24.000000
MM00153591 2013-12-01 08:20:00.000 2013-12-03 08:15:00.000 2013-12-03 8.250000
MM00154121 2013-12-05 08:19:00.000 2013-12-05 10:32:00.000 2013-12-05 2.216666
Note that in the second approach I introduced some udfs just to make the hours calculation appear a little cleaner.
Also note that both methods produce the same results (this time I ran the sample on Sql Server 2012 express, but I'm not using any 2012 specific features).
Finally, regarding the one discrepancy between your original expected results and these results for the hours. I believe your expected results were incorrect. The amount of time between 08:20 and midnight is 15 hours and 40 minutes, which is 15.6667 hours, not 15.333 hours.
Upvotes: 1