Reputation: 63
I have quite a problem that I have not been able to figure out. I created a view that when run, it gives me placement data for traveling workers. The important part is the PlacementID
, and the StartDate
and EndDate
:
TravelerID FirstName LastName PlacementID StartDate EndDate
--------------------------------------------------------------------
65648 Lori Williams 106593 09/22/01 02/08/03
65648 Lori Williams 392605 02/24/03 05/24/03
65648 Lori Williams 477950 05/26/03 11/22/03
65648 Lori Williams 600089 12/01/03 05/29/04
65648 Lori Williams 717424 05/30/04 12/04/04
65648 Lori Williams 832842 12/05/04 02/04/05
65648 Lori Williams 867492 02/06/05 07/30/05
65648 Lori Williams 979375 08/15/05 11/12/05
65648 Lori Williams 1030555 11/14/05 05/13/06
65648 Lori Williams 1155937 05/15/06 01/06/07
65648 Lori Williams 1341007 01/07/07 01/05/08
65648 Lori Williams 1709959 01/06/08 05/31/08
65648 Lori Williams 1878735 06/01/08 07/19/08
65648 Lori Williams 1937168 07/20/08 01/31/09
They would like to count consecutive placements as one, and use the PlacementID
from the first placement. Please note PlacementID
s 600089,717424, and 832842 for example. Notice the start date of the next is one day after the end date of the previous. There are other consecutive placements further down the list also. So the desired output would be:
TravelerID FirstName LastName PlacementID StartDate EndDate
--------------------------------------------------------------------
65648 Lori Williams 106593 09/22/01 02/08/03
65648 Lori Williams 392605 02/24/03 05/24/03
65648 Lori Williams 477950 05/26/03 11/22/03
65648 Lori Williams 600089 12/01/03 02/04/05
65648 Lori Williams 867492 02/06/05 07/30/05
65648 Lori Williams 979375 08/15/05 11/12/05
65648 Lori Williams 1030555 11/14/05 05/13/06
65648 Lori Williams 1155937 05/15/06 01/31/09
Here's some code to generate the sample data:
CREATE TABLE [dbo].vw_PlacementData(
TravelerID int
,FirstName varchar(255)
,LastName varchar(255)
,PlacementID int
,StartDate datetime
,EndDate datetime
) ON [PRIMARY]
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',106593,'9/22/01','2/8/03')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',392605,'2/24/03','5/24/03')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',477950,'5/26/03','11/22/03')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',600089,'12/1/03','5/29/04')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',717424,'5/30/04','12/4/04')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',832842,'12/5/04','2/4/05')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',867492,'2/6/05','7/30/05')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',979375,'8/15/05','11/12/05')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',1030555,'11/14/05','5/13/06')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',1155937,'5/15/06','1/6/07')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',1341007,'1/7/07','1/5/08')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',1709959,'1/6/08','5/31/08')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',1878735,'6/1/08','7/19/08')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',1937168,'7/20/08','1/31/09')
` I tried to join the table to itself but it doesn't account for more than one consecutive placement. As you can see from the example, there could be 2 or 20 consecutive placements that need to be counted.
Any help with this is greatly appreciated!!!
Upvotes: 1
Views: 1407
Reputation: 63
Someone offered a handy solution over at the msdn forums. Here is the syntax:
SELECT *
,CAST(NULL AS datetime) AS FinalStartDate
,CAST(NULL AS datetime) AS FinalEndDate
INTO #temp
FROM vw_PlacementData
CREATE CLUSTERED INDEX IDX_Clust_temp ON #temp (TravelerID,PlacementID,StartDate,EndDate)
DECLARE
@TravelerID int
,@PlacementID int
,@StartDate datetime
,@EndDate datetime
SELECT TOP 1
@TravelerID =TravelerID
,@PlacementID = PlacementID
,@StartDate = StartDate
,@EndDate = EndDate
FROM vw_PlacementData
ORDER BY TravelerID,StartDate,EndDate
--SELECT TOP 1 @TravelerID,@PlacementID ,@StartDate ,@EndDate
UPDATE t
SET @StartDate = FinalStartDate = CASE
WHEN TravelerID = @TravelerID
AND PlacementID > @PlacementID
AND StartDate = @EndDate + 1
THEN @StartDate
ELSE StartDate
END,
@TravelerID =TravelerID
,@PlacementID = PlacementID
--,@StartDate = FinalStartDate
,@EndDate = EndDate
-- OUTPUT INSERTED.StartDate,INSERTED.EndDate
FROM #temp t (TABLOCKX)
OPTION (MAXDOP 1)
SELECT
TravelerID
,FirstName
,LastName
,MIN(PlacementID) AS PlacementID
,FinalStartDate AS StartDate
,MAX(EndDate) AS EndDate
FROM #temp
GROUP BY TravelerID,FirstName,LastName,FinalStartDate
DROP TABLE #Temp
Upvotes: 1
Reputation: 1322
You can use a recursive CTE to join the placement segments together based on the start/end dates.
;WITH cteConsecutivePlacements AS(
SELECT
TravelerID
,FirstName
,LastName
,PlacementID
,StartDate
,EndDate
FROM dbo.vw_PlacementData
UNION ALL
SELECT
cte.TravelerID
,cte.FirstName
,cte.LastName
,cte.PlacementID
,cte.StartDate
,vpd.EndDate
FROM cteConsecutivePlacements cte
JOIN dbo.vw_PlacementData vpd
ON cte.TravelerID = vpd.TravelerID
WHERE DATEADD(DAY,1,cte.EndDate) = vpd.StartDate
)
The inner select statement uses the ROW_NUMBER function to identify the subsequent placements, and the aggregation in the outer select statement reduces the results to 1 row per consecutive placement, with the proper end date.
SELECT
TravelerID
,FirstName
,LastName
,PlacementID
,StartDate
,MAX(EndDate) AS EndDate
FROM(
SELECT
TravelerID
,FirstName
,LastName
,PlacementID
,StartDate
,EndDate
,ROW_NUMBER() OVER(PARTITION BY TravelerID, FirstName, LastName, EndDate ORDER BY StartDate ASC) AS row_num
FROM cteConsecutivePlacements
)q
WHERE q.row_num = 1
GROUP BY TravelerID, FirstName, LastName, PlacementID, StartDate
ORDER BY PlacementID, EndDate
Upvotes: 0