user915331
user915331

Reputation:

Merging records with consecutive dates in SQL

I have the following table schema:

RecordId    EmpID       AbsCode DateFrom    DateTo
---------------------------------------------------------------
666542      1511        AB      09/11/2011  10/11/2011
666986      1511        AB      11/11/2011  11/11/2011
666996      1511        EL      13/11/2011  17/11/2011
755485      1787        SL      01/11/2011  14/11/2011
758545      1787        SL      15/11/2011  26/11/2011
796956      1954        AB      09/11/2011  09/11/2011
799656      1367        AB      09/11/2011  09/11/2011
808845      1527        EL      16/11/2011  16/11/2011
823323      1527        EL      17/11/2011  17/11/2011
823669      1527        EL      18/11/2011  18/11/2011
899555      1123        AB      09/11/2011  09/11/2011
990990      1511        AB      12/11/2011  12/11/2011

As you can see, the data is entered separately for the same employee. Let's say he reported SL (Sick Leave) for one day, the data is entered, Then he called next day to report another sick leave for two days..etc. Now what I want is to merge these entries when presenting it to the client so that all consecutive Absences with the same absence code will be merged. For example the table above should look exactly like this:

EmpID       AbsCode DateFrom    DateTo
-------------------------------------------------------------
1511        AB      09/11/2011  12/11/2011
1511        EL      13/11/2011  17/11/2011
1787        SL      01/11/2011  26/11/2011
1954        AB      09/11/2011  09/11/2011
1367        AB      09/11/2011  09/11/2011
1527        EL      16/11/2011  18/11/2011
1123        AB      09/11/2011  09/11/2011

I am not an SQL guy, I can do it using a loop in C# to iterate a DataSet or DataReader but I hope to do this with T-SQL in a stored proc. I have found similar questions in StackOverFlow and checked them all, None of them applies for the above sample table.

EDIT: Sometimes I will Have a situations like this:

RecordId    EmpID       AbsCode DateFrom    DateTo
---------------------------------------------------------------
666542      1511        AB      09/11/2011  10/11/2011
666986      1511        AB      11/11/2011  25/12/2011

As you can see, this Employee had Absence from 9/11/2011 (d/M/yyyy) to (25/12/2011) but the client requested to have Absence list from 1st of December until 31st of December, So the result should be:

EmpID       AbsCode DateFrom    DateTo
-------------------------------------------------------------
1511        AB      01/12/2011  12/11/2011

So basically, It will show the result according to parameters supplied (from, to). If the record stared before the requested period, It will show it but at the same time it will show the start of the record according to the parameters supplied, Same applies from records endings after the (from, to) parameters.

Upvotes: 2

Views: 5528

Answers (3)

Stephen Turner
Stephen Turner

Reputation: 7314

This is a CTE so it'll all need to be executed as one, but I'll explain as I go.

First I'll set the parameters for the date range we are interested in:

DECLARE @StartDate DateTime; SET @StartDate = '2011-11-01';  
DECLARE @EndDate DateTime; SET @EndDate = '2011-11-30';  

Then I'll turn them into a list of dates using a recursive CTE

WITH 
    ValidDates ( ValidDate ) AS 
        (
            SELECT @StartDate 
                UNION ALL
            SELECT DateAdd(day, 1, ValidDate) 
                FROM ValidDates 
                WHERE ValidDate < @EndDate
        ),

By joining that with ranges in the original records I get a list of individual days absence.

Using a combination of row_number and datediff I can group consecutive dates. This assumes that there are no duplicates.

    DaysAbsent AS 
        (
            SELECT 
                  A.RecordID
                , A.EmpID
                , A.AbsCode
                , DateDiff(Day, @StartDate, D.ValidDate) 
                    - row_number() 
                        over (partition by A.EmpID, A.AbsCode  
                            order by D.ValidDate) AS DayGroup
                , D.ValidDate AS AbsentDay
            FROM 
                dbo.Absence A
                    INNER JOIN  
                ValidDates D
                    ON D.ValidDate >= DateFrom 
                       and  D.ValidDate <= DateTo 
        )

Now it's a simple select with min and max to turn it back into ranges.

SELECT 
      EmpID
    , AbsCode
    , MIN(AbsentDay) AS DateFrom
    , MAX(AbsentDay) AS DateTo
FROM
    DaysAbsent
GROUP BY
      EmpID
    , AbsCode
    , DayGroup

The DayGroup isn't needed in the output but is needed for the grouping, otherwise non consecutive groups will be collapsed into one.

Upvotes: 2

SQLMason
SQLMason

Reputation: 3275

This will get you the number of days that each employee was off. I think that they would want to see that over the date ranges. You can convert the varchars to dates by using: DATEDIFF(DAY, CONVERT(DATETIME,[DateFrom],103), CONVERT(DATETIME,[DateTo],103))

DECLARE @myTable TABLE
(
    RecordId    INT,
    EmpID       INT,
    AbsCode     VARCHAR(2),
    DateFrom    VARCHAR(12),
    DateTo      VARCHAR(12)
)
INSERT INTO @myTable
(
    RecordId,
    EmpID,
    AbsCode,
    DateFrom,
    DateTo
)
SELECT 666542, 1511, 'AB', '09/11/2011', '10/11/2011' UNION ALL 
SELECT 666986, 1511, 'AB', '11/11/2011', '11/11/2011' UNION ALL 
SELECT 666996, 1511, 'EL', '13/11/2011', '17/11/2011' UNION ALL 
SELECT 755485, 1787, 'SL', '01/11/2011', '14/11/2011' UNION ALL 
SELECT 758545, 1787, 'SL', '15/11/2011', '26/11/2011' UNION ALL 
SELECT 796956, 1954, 'AB', '09/11/2011', '09/11/2011' UNION ALL 
SELECT 799656, 1367, 'AB', '09/11/2011', '09/11/2011' UNION ALL 
SELECT 808845, 1527, 'EL', '16/11/2011', '16/11/2011' UNION ALL 
SELECT 823323, 1527, 'EL', '17/11/2011', '17/11/2011' UNION ALL 
SELECT 823669, 1527, 'EL', '18/11/2011', '18/11/2011' UNION ALL 
SELECT 899555, 1123, 'AB', '09/11/2011', '09/11/2011' UNION ALL 
SELECT 990990, 1511, 'AB', '12/11/2011', '12/11/2011'


SELECT [RecordId], [EmpID], [AbsCode], SUM(DAYS) NoDays
FROM
(
    SELECT [RecordId], [EmpID], [AbsCode], DATEDIFF(DAY, CONVERT(DATETIME,[DateFrom],103), CONVERT(DATETIME,[DateTo],103)) Days
    FROM @myTable
    GROUP BY [RecordId], [EmpID], [AbsCode], DATEDIFF(DAY, CONVERT(DATETIME,[DateFrom],103), CONVERT(DATETIME,[DateTo],103))
) subQuery
GROUP BY [RecordId], [EmpID], [AbsCode]

Upvotes: 1

Ta01
Ta01

Reputation: 31610

Select EmpId, AbsCode, MIN(DateFrom) as DateFrom, MAX(DateTo) as DateTo From YOURTABLE
Group By EmpId, AbsCode

Upvotes: 0

Related Questions