user1682461
user1682461

Reputation: 111

SQL Server Query to group sequential dates

I have a table named Absence Details and I want to group sequential dates. Here, is the data

EID        AbsenceType  AbsenceStartDate               AbsenceEndDate
769     Holiday     2012-06-25  00:00:00.000            2012-06-25 23:59:59.000
769     Holiday     2012-06-26  00:00:00.000            2012-06-26 23:59:59.000
769     Holiday     2012-09-03  00:00:00.000            2012-09-03 23:59:59.000
769     Holiday     2012-09-04  00:00:00.000            2012-09-04 23:59:59.000
769     Holiday     2012-09-05  00:00:00.000            2012-09-05 23:59:59.000
769     Holiday     2012-09-06  00:00:00.000            2012-09-06 23:59:59.000
769     Holiday     2012-09-07  00:00:00.000            2012-09-07 23:59:59.000

The result i am trying to get is

EID     AbsenceType AbsenceStartDate          AbsenceEndDate
769     Holiday     2012-06-25  00:00:00.000         2012-06-26 23:59:59.000
769     Holiday     2012-09-03  00:00:00.000         2012-09-07 23:59:59.000

Any help is much appreciated.

Upvotes: 7

Views: 1619

Answers (4)

user1682461
user1682461

Reputation: 111

Here is the solution that worked for me.

SELECT EID, AbsenceType, MIN(AbsenceStartDate) AS AbsenceStartDate, MAX(AbsenceEndDate) AS AbsenceEndDate
FROM (SELECT EID, AbsenceType, AbsenceStartDate, AbsenceEndDate,
      DATEADD(dd, - ROW_NUMBER() OVER (PARTITION BY EID, AbsenceType ORDER BY EID,AbsenceStartDate), AbsenceStartDate)
      FROM AbsenceDetails
      GROUP BY EID,AbsenceType,AbsenceStartDate,AbsenceEndDate
      ) a(EID, AbsenceType, AbsenceStartDate, AbsenceEndDate, Grp)
GROUP BY EID, AbsenceType, Grp;

Upvotes: 4

dani herrera
dani herrera

Reputation: 51735

I have simplified your scenario to isolate the main problem. Let's supose this table with gaps:

with ns as (
select 1 as n union
select 2 as n union
select 3 as n union
select 8 as n union    --gap
select 9 as n )
select * 
into #ns
from ns;

Now, the result you are expecting for is:

ini fi 
--- -- 
1   3  
8   9  

To get this results I massage the data in this way: first I create two views with start and end periods and second, I join both views to get final result. Notice that I join table with it self to locate starts and ends periods:

with 
inis as                                     -- identifying start periods
(
   select n1.n
   from #ns n1
   left outer join #ns n2 
       on n1.n = n2.n + 1
   where n2.n is null
   ),
fis as                                      -- identifying ends periods
( 
   select n1.n 
   from #ns n1
   left outer join #ns n2 
       on n1.n = n2.n - 1
   where n2.n is null
   )  
select inis.n as ini, min( fis.n ) as fi    -- joining starts and ends
from inis 
inner join fis 
  on inis.n <= fis.n
group by inis.n
;
   

You can transfer this technique to your data and data types. If you have any issue translating query be free to ask.

Check query and results.

Upvotes: 4

Germann Arlington
Germann Arlington

Reputation: 3353

If I understood your question correctly you want to find continuous time intervals in your records.
The main problem will be identifying what actually constitutes continuous time interval:
If you are looking at absence at work than any sequence of

date1.09:00 to date1.18:00  
date2.09:00 to date2.18:00  

where date2 is a next business day after date1 can be considered continuous.

In your case it is relatively easy, but you will not be able to do it in a single query. At least I can't think of a way to do it right now.

P.S. "Islands and Gaps" algorithm suggested by "podiluska" will help you to write it in single query/stored procedure.

Upvotes: 0

Vikdor
Vikdor

Reputation: 24134

I would do it as follows:

  1. Identify the list of sequence of absence dates.

    SELECT
        ad1.EID, ad1.StartDate, ad2.EndDate
    FROM 
        AbsenceDetails ad1
        JOIN AbsenceDetails ad2
        ON ad1.EID = ad2.EID
    WHERE
        DATEDIFF(ss, ad1.EndDate, ad2.StartDate) = 1
    

    The results would be as follows:

    769 2012-06-25 00:00:00.000 2012-06-26 23:59:59.000
    769 2012-09-03 00:00:00.000 2012-09-04 23:59:59.000
    769 2012-09-04 00:00:00.000 2012-09-05 23:59:59.000
    769 2012-09-05 00:00:00.000 2012-09-06 23:59:59.000
    769 2012-09-06 00:00:00.000 2012-09-07 23:59:59.000
    
  2. Iterate through the list and identify the start and end duration of each stretch. This is better done at the app layer.

Upvotes: 1

Related Questions