kenneth2k1
kenneth2k1

Reputation: 63

Sequential grouping of start and end date ranges

I have quite a problem that I have not been able to figure out. I created a view that when run, it gives me placement data for traveling workers. The important part is the PlacementID, and the StartDate and EndDate:

TravelerID  FirstName   LastName    PlacementID StartDate   EndDate
--------------------------------------------------------------------
65648       Lori        Williams    106593      09/22/01    02/08/03
65648       Lori        Williams    392605      02/24/03    05/24/03
65648       Lori        Williams    477950      05/26/03    11/22/03
65648       Lori        Williams    600089      12/01/03    05/29/04
65648       Lori        Williams    717424      05/30/04    12/04/04
65648       Lori        Williams    832842      12/05/04    02/04/05
65648       Lori        Williams    867492      02/06/05    07/30/05
65648       Lori        Williams    979375      08/15/05    11/12/05
65648       Lori        Williams    1030555     11/14/05    05/13/06
65648       Lori        Williams    1155937     05/15/06    01/06/07
65648       Lori        Williams    1341007     01/07/07    01/05/08
65648       Lori        Williams    1709959     01/06/08    05/31/08
65648       Lori        Williams    1878735     06/01/08    07/19/08
65648       Lori        Williams    1937168     07/20/08    01/31/09

They would like to count consecutive placements as one, and use the PlacementID from the first placement. Please note PlacementIDs 600089,717424, and 832842 for example. Notice the start date of the next is one day after the end date of the previous. There are other consecutive placements further down the list also. So the desired output would be:

TravelerID  FirstName   LastName    PlacementID StartDate   EndDate
--------------------------------------------------------------------
65648       Lori        Williams    106593      09/22/01    02/08/03
65648       Lori        Williams    392605      02/24/03    05/24/03
65648       Lori        Williams    477950      05/26/03    11/22/03
65648       Lori        Williams    600089      12/01/03    02/04/05
65648       Lori        Williams    867492      02/06/05    07/30/05
65648       Lori        Williams    979375      08/15/05    11/12/05
65648       Lori        Williams    1030555     11/14/05    05/13/06
65648       Lori        Williams    1155937     05/15/06    01/31/09

Here's some code to generate the sample data:

CREATE TABLE [dbo].vw_PlacementData(
 TravelerID int
,FirstName varchar(255)
,LastName varchar(255)
,PlacementID int
,StartDate datetime
,EndDate datetime
) ON [PRIMARY]

INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',106593,'9/22/01','2/8/03')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',392605,'2/24/03','5/24/03')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',477950,'5/26/03','11/22/03')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',600089,'12/1/03','5/29/04')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',717424,'5/30/04','12/4/04')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',832842,'12/5/04','2/4/05')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',867492,'2/6/05','7/30/05')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',979375,'8/15/05','11/12/05')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',1030555,'11/14/05','5/13/06')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',1155937,'5/15/06','1/6/07')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',1341007,'1/7/07','1/5/08')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',1709959,'1/6/08','5/31/08')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',1878735,'6/1/08','7/19/08')
INSERT INTO vw_PlacementData VALUES (65648,'Lori','Williams',1937168,'7/20/08','1/31/09')

` I tried to join the table to itself but it doesn't account for more than one consecutive placement. As you can see from the example, there could be 2 or 20 consecutive placements that need to be counted.

Any help with this is greatly appreciated!!!

Upvotes: 1

Views: 1407

Answers (2)

kenneth2k1
kenneth2k1

Reputation: 63

Someone offered a handy solution over at the msdn forums. Here is the syntax:

SELECT *
    ,CAST(NULL AS datetime) AS FinalStartDate
    ,CAST(NULL AS datetime) AS FinalEndDate 
INTO #temp
FROM vw_PlacementData

CREATE CLUSTERED INDEX IDX_Clust_temp ON #temp (TravelerID,PlacementID,StartDate,EndDate)

DECLARE 
     @TravelerID int
    ,@PlacementID int
    ,@StartDate datetime
    ,@EndDate datetime

SELECT TOP 1 
     @TravelerID =TravelerID 
    ,@PlacementID = PlacementID
    ,@StartDate  = StartDate
    ,@EndDate = EndDate
FROM vw_PlacementData
ORDER BY TravelerID,StartDate,EndDate

--SELECT TOP 1 @TravelerID,@PlacementID  ,@StartDate  ,@EndDate

UPDATE t
SET @StartDate = FinalStartDate = CASE 
                                    WHEN TravelerID = @TravelerID 
                                    AND PlacementID > @PlacementID 
                                    AND  StartDate = @EndDate + 1
                                  THEN @StartDate
                                  ELSE StartDate
                             END,
    @TravelerID =TravelerID
   ,@PlacementID = PlacementID 
 --,@StartDate  = FinalStartDate
   ,@EndDate = EndDate
 -- OUTPUT INSERTED.StartDate,INSERTED.EndDate
 FROM #temp t (TABLOCKX)
 OPTION (MAXDOP 1)

SELECT 
     TravelerID
    ,FirstName
    ,LastName
    ,MIN(PlacementID) AS PlacementID
    ,FinalStartDate AS StartDate
    ,MAX(EndDate) AS EndDate
FROM #temp
GROUP BY TravelerID,FirstName,LastName,FinalStartDate

DROP TABLE  #Temp

Upvotes: 1

supergrady
supergrady

Reputation: 1322

You can use a recursive CTE to join the placement segments together based on the start/end dates.

;WITH cteConsecutivePlacements AS(
    SELECT
         TravelerID
        ,FirstName
        ,LastName
        ,PlacementID
        ,StartDate
        ,EndDate
    FROM dbo.vw_PlacementData

    UNION ALL

    SELECT
         cte.TravelerID
        ,cte.FirstName
        ,cte.LastName
        ,cte.PlacementID
        ,cte.StartDate
        ,vpd.EndDate
    FROM cteConsecutivePlacements cte
    JOIN dbo.vw_PlacementData vpd
        ON cte.TravelerID = vpd.TravelerID
    WHERE DATEADD(DAY,1,cte.EndDate) = vpd.StartDate
)

The inner select statement uses the ROW_NUMBER function to identify the subsequent placements, and the aggregation in the outer select statement reduces the results to 1 row per consecutive placement, with the proper end date.

SELECT
     TravelerID
    ,FirstName
    ,LastName
    ,PlacementID
    ,StartDate
    ,MAX(EndDate) AS EndDate
FROM( 
    SELECT 
         TravelerID
        ,FirstName
        ,LastName
        ,PlacementID
        ,StartDate
        ,EndDate
        ,ROW_NUMBER() OVER(PARTITION BY TravelerID, FirstName, LastName, EndDate ORDER BY StartDate ASC) AS row_num
    FROM cteConsecutivePlacements   
)q
WHERE q.row_num = 1
GROUP BY TravelerID, FirstName, LastName, PlacementID, StartDate
ORDER BY PlacementID, EndDate

Upvotes: 0

Related Questions