kimberly
kimberly

Reputation: 11

How to get Consecutive Day Count in SQL and break when no longer consecutive

I am trying to create a query that counts the number of consecutive days worked for employees. It should reset when there is a break in consecutive days and I can't get it to work. Any ideas would be greatly appreciated it!

This is my query. I'm trying to use the row_number function but I'm not sure if that is the correct thing to do:

WITH CTE AS (
      select distinct TOT.EMPLOYEEID, TOT.DATE
      from TOTALS TOT
      where TOT.EMPLOYEEID IN ('020576','1200823') and
            TOT.TIMEINSECONDS >= 14400 and
            TOT.DATE >= '2019-01-01'
     )
SELECT CTE.*,
       ROW_NUMBER() OVER (PARTITION BY EMPLOYEEID ORDER BY DATE) AS CONSECUTIVEDAYS
FROM CTE
ORDER BY EMPLOYEEID, DATE;

Here is the result:

EMPLOYEEID        DATE        CONSECUTIVEDAYS
020576        2019-01-01 00:00:00.000        1
020576        2019-01-02 00:00:00.000        2
020576        2019-01-03 00:00:00.000        3
020576        2019-01-04 00:00:00.000        4
020576        2019-01-07 00:00:00.000        5 <---- THIS SHOULD BE 1. THE COUNT SHOULD RESET BECAUSE THE DATES ARE NO LONGER CONSECUTIVE.

020576        2019-01-08 00:00:00.000        6
020576        2019-01-09 00:00:00.000        7
020576        2019-01-10 00:00:00.000        8
020576        2019-01-11 00:00:00.000        9
020576        2019-01-14 00:00:00.000        10 <---THIS SHOULD BE 1. THE COUNT SHOULD RESET BECAUSE THE DATES ARE NO LONGER CONSECUTIVE.

020576        2019-01-15 00:00:00.000        11
020576        2019-01-16 00:00:00.000        12
020576        2019-01-17 00:00:00.000        13
020576        2019-01-21 00:00:00.000        14 <---THIS SHOULD BE 1. THE COUNT SHOULD RESET BECAUSE THE DATES ARE NO LONGER CONSECUTIVE.

020576        2019-01-22 00:00:00.000        15
020576        2019-01-23 00:00:00.000        16
020576        2019-01-24 00:00:00.000        17
020576        2019-01-25 00:00:00.000        18
020576        2019-01-28 00:00:00.000        19 <---THIS SHOULD BE 1. THE COUNT SHOULD RESET BECAUSE THE DATES ARE NO LONGER CONSECUTIVE.

020576        2019-01-29 00:00:00.000        20
020576        2019-01-30 00:00:00.000        21
020576        2019-01-31 00:00:00.000        22
1200823        2019-01-01 00:00:00.000        1
1200823        2019-01-02 00:00:00.000        2
1200823        2019-01-03 00:00:00.000        3
1200823        2019-01-04 00:00:00.000        4
1200823        2019-01-07 00:00:00.000        5 <---THIS SHOULD BE 1. THE COUNT SHOULD RESET BECAUSE THE DATES ARE NO LONGER CONSECUTIVE.
1200823        2019-01-08 00:00:00.000        6
1200823        2019-01-09 00:00:00.000        7
1200823        2019-01-10 00:00:00.000        8
1200823        2019-01-11 00:00:00.000        9
1200823        2019-01-14 00:00:00.000        10 <---THIS SHOULD BE 1. THE COUNT SHOULD RESET BECAUSE THE DATES ARE NO LONGER CONSECUTIVE.
1200823        2019-01-15 00:00:00.000        11
1200823        2019-01-16 00:00:00.000        12
1200823        2019-01-18 00:00:00.000        13 <---THIS SHOULD BE 1. THE COUNT SHOULD RESET BECAUSE THE DATES ARE NO LONGER CONSECUTIVE.
1200823        2019-01-21 00:00:00.000        14 <---THIS SHOULD BE 1. THE COUNT SHOULD RESET BECAUSE THE DATES ARE NO LONGER CONSECUTIVE.
1200823        2019-01-22 00:00:00.000        15
1200823        2019-01-23 00:00:00.000        16
1200823        2019-01-24 00:00:00.000        17
1200823        2019-01-25 00:00:00.000        18
1200823        2019-01-28 00:00:00.000        19 <---THIS SHOULD BE 1. THE COUNT SHOULD RESET BECAUSE THE DATES ARE NO LONGER CONSECUTIVE.
1200823        2019-01-29 00:00:00.000        20
1200823        2019-01-30 00:00:00.000        21

Upvotes: 1

Views: 2438

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269773

You can identify the groups of adjacent row numbers in different ways. One method is to use LAG() and determine where the groups start. Then a cumulative sum defines the group.

The final step is ROW_NUMBER() over each group to get the numbering that you want:

WITH CTE AS (
      select distinct TOT.EMPLOYEEID, TOT.DATE
      from TOTALS TOT
      where TOT.EMPLOYEEID IN ('020576','1200823') and
            TOT.TIMEINSECONDS >= 14400 and
            TOT.DATE >= '2019-01-01'
     )
SELECT t.*,
        ROW_NUMBER() OVER PARTITION BY EMPLOYEEID, GRP ORDER BY Date) as seqnum
FROM (SELECT CTE.*,
             SUM(CASE WHEN prev_date < date - interval '1 day' THEN 1 ELSE 0 END) OVER (PARTITION BY EMPLOYEEID ORDER BY DATE) as grp
             ROW_NUMBER() OVER (PARTITION BY EMPLOYEEID ORDER BY DATE) AS CONSECUTIVEDAYS
      FROM (SELECT CTE.*,
                   LAG(DATE) OVER (PARTITION BY EMPLOYEEID ORDER BY DATE) as prev_date
            FROM CTE
           ) t
     ) t
ORDER BY EMPLOYEEID, DATE;

Note that this uses generic date functions. Date handling is notoriously database specific. It should be easy enough to adapt to the database you are actually using.

Upvotes: 1

Related Questions