Reputation: 928
I have found a lot of questions and answers asking how to convert a date range to records per day, but I need the opposite and can't find anything yet.
So let's say I have this dataset:
User | Available
1 | 01-01-2019
1 | 02-01-2019
1 | 03-01-2019
1 | 04-01-2019
2 | 05-01-2019
2 | 06-01-2019
2 | 07-01-2019
2 | 10-01-2019
2 | 11-01-2019
2 | 12-01-2019
So we have user 1 who is available from 01/01/2019 to 04/01/2019. Then we have user 2 who is available from 05/01/2019 to 07/01/2019 and 10/01/2019 to 12/01/2019.
The result I am looking for should look like this:
User | Start | End
1 | 01-01-2019 | 04-01-2019
2 | 05-01-2019 | 07-01-2019
2 | 10-01-2019 | 12-01-2019
User 1 was fairly easy to calculate using min/max dates, but with the gaps of user 2, I am completely lost. Any suggestions?
Upvotes: 3
Views: 226
Reputation: 82020
I'm reading as MONTHS not DAYS
Example
Select [User]
,[Start] = min([Available])
,[End] = max([Available])
From (
Select *
,Grp = DateDiff(MONTH,'1900-01-01',[Available]) - Row_Number() over (Partition By [User] Order by [Available])
From YourTable
) A
Group By [User],[Grp]
Returns
User Start End
1 2019-01-01 2019-04-01
2 2019-05-01 2019-07-01
2 2019-10-01 2019-12-01
Upvotes: 1
Reputation: 14209
I had to do this before somewhere too, this is the solution I used. Basically use a row number split by your grouping columns and ordered by date, and additionally calculate the amount of days from a particular date onwards (any hard-coded day will work).
The key here is that while the row number increases 1 by 1, the anchor difference will only increase 1 by 1 if the days are consecutive. Thus, the rest between the anchor diff and the row number will stay the same only if there are consecutive dates, allowing you to group by and calculate min/max.
IF OBJECT_ID('tempdb..#Availabilities') IS NOT NULL
DROP TABLE #Availabilities
CREATE TABLE #Availabilities (
[User] INT,
Available DATE)
INSERT INTO #Availabilities
VALUES
(1, '2019-01-01'),
(1, '2019-01-02'),
(1, '2019-01-03'),
(1, '2019-01-04'),
(2, '2019-01-05'),
(2, '2019-01-06'),
(2, '2019-01-07'),
(2, '2019-01-10'),
(2, '2019-01-11'),
(2, '2019-01-12')
;WITH WindowFunctions AS
(
SELECT
A.[User],
A.Available,
AnchorDayDifference = DATEDIFF(DAY, '2018-01-01', A.Available),
RowNumber = ROW_NUMBER() OVER (PARTITION BY A.[User] ORDER BY A.Available)
FROM
#Availabilities AS A
)
SELECT
T.[User],
Start = MIN(T.Available),
[End] = MAX(T.Available)
FROM
WindowFunctions AS T
GROUP BY
T.[User],
T.AnchorDayDifference - T.RowNumber
Result:
User Start End
1 2019-01-01 2019-01-04
2 2019-01-05 2019-01-07
2 2019-01-10 2019-01-12
The WindowFunctions
values are (added the posterior rest result):
User Available AnchorDayDifference RowNumber GroupingRestResult
1 2019-01-01 365 1 364
1 2019-01-02 366 2 364
1 2019-01-03 367 3 364
1 2019-01-04 368 4 364
2 2019-01-05 369 1 368
2 2019-01-06 370 2 368
2 2019-01-07 371 3 368
2 2019-01-10 374 4 370
2 2019-01-11 375 5 370
2 2019-01-12 376 6 370
Upvotes: 3
Reputation: 96027
This is a "common" Groups and Island question. Provided you're on SQL Server 2012+ (and if you're not, it's time to upgrade) this gets you the result you're after:
USE Sandbox;
GO
WITH VTE AS(
SELECT V.[User],
CONVERT(date,Available,105) AS Available
FROM (VALUES(1,'01-01-2019'),
(1,'02-01-2019'),
(1,'03-01-2019'),
(1,'04-01-2019'),
(2,'05-01-2019'),
(2,'06-01-2019'),
(2,'07-01-2019'),
(2,'10-01-2019'),
(2,'11-01-2019'),
(2,'12-01-2019')) V([User],Available)),
Diffs AS(
SELECT V.[User],
V.Available,
DATEDIFF(DAY, LAG(V.Available,1,DATEADD(DAY, -1, V.Available)) OVER (PARTITION BY V.[User] ORDER BY V.Available), V.Available) AS Diff
FROM VTE V),
Groups AS(
SELECT D.[User],
D.Available,
COUNT(CASE WHEN D.Diff > 1 THEN 1 END) OVER (PARTITION BY D.[User] ORDER BY D.Available
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Grp
FROM Diffs D)
SELECT G.[User],
MIN(G.Available) AS [Start],
MAX(G.Available) AS [End]
FROM Groups G
GROUP BY G.[User],
G.Grp
ORDER BY G.[User],
[Start];
The first CTE Diffs
, excluding VTE
("Value Table Expression") for the sample data, gets the difference in days between the different rows. The second CTE Groups
then puts the dates into groups (surprise that), base on if the difference was more than 1. Then we can use those groups to get a MIN
and MAX
for that group in the final SELECT
.
Upvotes: 2