Asha
Asha

Reputation: 4311

tsql best way to merge records with start date and end date when there is no gap in between

I have a mapping table as follows :

FirstEntityID int
MappedTo int
BeginDate Date
EndDate Date

and lets say I have following records in the table :

FirstEntityID  MappedTo  BeginDate     EndDate 
     1             2     2012-09-01   2012-10-01
     2             3     2012-09-01   2012-10-01
     1             2     2012-10-02   2012-11-24
     2             3     2012-11-01   2012-11-24

I need a script which will get this table and merges records based on the Start and end date to return a result like :

   FirstEntityID  MappedTo  BeginDate     EndDate 
     1             2     2012-09-01   2012-11-24
     2             3     2012-09-01   2012-10-01
     2             3     2012-11-01   2012-11-24

Upvotes: 1

Views: 1409

Answers (3)

paparazzo
paparazzo

Reputation: 45096

Tested this and it seems to work

It will fail on an edge case with duplicate rows.
For that would need to go with a RowNumber approach like Ypercube.
Or add a constraint on the table to force the row to be unique.

-- first the overlaps
SELECT T1.FirstEntityId, T1.MappedTo, T1.BeginDate, Max(T2.EndDate) as [EndDate]
FROM tablex as T1
join tablex as T2  
  on  T1.FirstEntityId = T2.FirstEntityId
  and T1.MappedTo = T2.MappedTo
  and T1.EndDate >= T2.BeginDate 
  and T1.EndDate <  T2.EndDate
  and T1.BeginDate <= T2.BeginDate
GROUP BY T1.FirstEntityId, T1.MappedTo, T1.BeginDate  

union

-- add the non overlaps
SELECT T1.FirstEntityId, T1.MappedTo, T1.BeginDate, T1.EndDate
FROM tablex as T1
join tablex as T2  
  on  T1.FirstEntityId = T2.FirstEntityId
  and T1.MappedTo = T2.MappedTo
  and (  T1.EndDate  < T2.BeginDate or T1.BeginDate > T2.EndDate 
   or   (T1.BeginDate  < T2.BeginDate and T1.EndDate > T2.EndDate) )

order by FirstEntityId, MappedTo, BeginDate

Upvotes: 1

ypercubeᵀᴹ
ypercubeᵀᴹ

Reputation: 115510

Using CTEs, we find the starting dates first:

; WITH StartD AS
  ( SELECT 
        FirstEntityID
      , MappedTo
      , BeginDate
      , ROW_NUMBER() OVER( PARTITION BY FirstEntityID, MappedTo
                           ORDER BY BeginDate )
          AS Rn 
    FROM
        tableX AS t
    WHERE
        NOT EXISTS
        ( SELECT *
          FROM tableX AS p
          WHERE p.FirstEntityID = t.FirstEntityID
            AND p.MappedTo = t.MappedTo
            AND p.BeginDate < t.BeginDate 
            AND t.BeginDate <= DATEADD(day, 1, p.EndDate)
        )
  )

then the ending dates:

, EndD AS
  ( SELECT 
        FirstEntityID
      , MappedTo
      , EndDate
      , ROW_NUMBER() OVER( PARTITION BY FirstEntityID, MappedTo
                           ORDER BY EndDate )
          AS Rn 
    FROM
        tableX AS t
    WHERE
        NOT EXISTS
        ( SELECT *
          FROM tableX AS p
          WHERE p.FirstEntityID = t.FirstEntityID
            AND p.MappedTo = t.MappedTo
            AND DATEADD(day, -1, p.BeginDate) <= t.EndDate
            AND t.EndDate < p.EndDate
        )
  )

and the final result:

SELECT
    s.FirstEntityID
  , s.MappedTo
  , s.BeginDate
  , e.EndDate
FROM
    StartD AS s
  JOIN
    EndD AS e
      ON  e.FirstEntityID = s.FirstEntityID
      AND e.MappedTo = s.MappedTo
      AND e.Rn = s.Rn ;

Tested in SQL-Fiddle

Upvotes: 3

Michael Yoon
Michael Yoon

Reputation: 1606

This might work:

SELECT FirstEntityId, MappedTo, Min(BeginDate), Max(EndDate)
FROM
    T1
GROUP BY
    FirstEntityId, MappedTo

Upvotes: 0

Related Questions