Christian Townsend
Christian Townsend

Reputation: 301

SQL: Find group of rows for consecutive days absent

I have the following Attendance table in my Microsoft SQL Server 2016:

ID         StudentID  Date          AbsenceReasonID
----------------------------------------------------
430957     10158      2018-02-02    2   
430958     10158      2018-02-03    2   
430959     10158      2018-02-04    11  
430960     12393      2018-03-15    9   
430961     1          2018-03-15    9   
430962     12400      2018-03-15    9   
430963     5959       2018-03-15    11  

I would like to have a query that retrieves a group of rows where 3 or MORE absences have occurred consecutively by the Date column for a single student (StudentID). Ideally, the following data after running the query would be

ID         StudentID  Date          AbsenceReasonID
----------------------------------------------------
430957     10158      2018-02-02    2   
430958     10158      2018-02-03    2   
430959     10158      2018-02-04    11  

Note that if a student is absent on a Friday, I would like that to carry through over the weekend to Monday (Disregard Weekend dates).

If anymore information is required to better assist you in assisting me, please let me know. I have used the following query as a starter but know it is not what I am looking for:

SELECT 
    CONVERT(datetime, A.DateOF, 103),
    A.SchoolNum, EI.FullName,
    COUNT(A.SchoolNum) as 'Absences'
FROM 
    Attendance A
INNER JOIN 
    EntityInformation EI ON EI.SchoolNum = A.SchoolNum AND EI.Deleted = 0
INNER JOIN 
    Enrolment E ON EI.SchoolNum = E.SchoolNum AND E.Deleted = 0
GROUP BY 
    A.SchoolNum, A.DateOf, FullName
HAVING 
    COUNT(A.SchoolNum) > 1
    AND A.DateOf = GETDATE()
    AND A.SchoolNum in (SELECT SchoolNum FROM Attendance A1 
                        WHERE A1.DateOf = A.DateOf -7)

This is more of a static solution that retrieves absences where the student's ID occurred twice in the past 7 days. This is neither consecutive or three or more days.

Upvotes: 2

Views: 1495

Answers (3)

EzLo
EzLo

Reputation: 14199

You can use this to find your absence ranges. In here I use a recursive CTE to number all days from a few years while at the same time record their week day. Then use another recursive CTE to join absence dates for the same student that are one day after another, considering weekends should be skipped (read the CASE WHEN on the join clause). At the end show each absence spree filtered by N successive days.

SET DATEFIRST 1 -- Monday = 1, Sunday = 7

;WITH Days AS
(
    -- Recursive anchor: hard-coded first date
    SELECT
        GeneratedDate = CONVERT(DATE, '2017-01-01')

    UNION ALL

    -- Recursive expression: all days until day X
    SELECT
        GeneratedDate = DATEADD(DAY, 1, D.GeneratedDate)
    FROM
        Days AS D
    WHERE
        DATEADD(DAY, 1, D.GeneratedDate) <= '2020-01-01'
),
NumberedDays AS
(
    SELECT
        GeneratedDate = D.GeneratedDate,
        DayOfWeek = DATEPART(WEEKDAY, D.GeneratedDate),
        DayNumber = ROW_NUMBER() OVER (ORDER BY D.GeneratedDate ASC)
    FROM
        Days AS D
),
AttendancesWithNumberedDays AS
(
    SELECT
        A.*,
        N.*
    FROM
        Attendance AS A
        INNER JOIN NumberedDays AS N ON A.Date = N.GeneratedDate
),
AbsenceSpree AS
(
    -- Recursive anchor: absence day with no previous absence, skipping weekends
    SELECT
        StartingAbsenceDate = A.Date,
        CurrentDateNumber = A.DayNumber,
        CurrentDateDayOfWeek = A.DayOfWeek,
        AbsenceDays = 1,
        StudentID = A.StudentID
    FROM
        AttendancesWithNumberedDays AS A
    WHERE
        NOT EXISTS (
            SELECT
                'no previous absence date'
            FROM
                AttendancesWithNumberedDays AS X
            WHERE
                X.StudentID = A.StudentID AND
                X.DayNumber = CASE A.DayOfWeek 
                    WHEN 1 THEN A.DayNumber - 3 -- When monday then friday (-3)
                    WHEN 7 THEN A.DayNumber - 2 -- When sunday then friday (-2)
                    ELSE A.DayNumber - 1 END)

    UNION ALL

    -- Recursive expression: find the next absence day, skipping weekends
    SELECT
        StartingAbsenceDate = S.StartingAbsenceDate,
        CurrentDateNumber = A.DayNumber,
        CurrentDateDayOfWeek = A.DayOfWeek,
        AbsenceDays = S.AbsenceDays + 1,
        StudentID = A.StudentID
    FROM
        AbsenceSpree AS S
        INNER JOIN AttendancesWithNumberedDays AS A ON
            S.StudentID = A.StudentID AND
            A.DayNumber = CASE S.CurrentDateDayOfWeek
                WHEN 5 THEN S.CurrentDateNumber + 3 -- When friday then monday (+3)
                WHEN 6 THEN S.CurrentDateNumber + 2 -- When saturday then monday (+2)
                ELSE S.CurrentDateNumber + 1 END
)
SELECT
    StudentID = A.StudentID,
    StartingAbsenceDate = A.StartingAbsenceDate,
    EndingAbsenceDate = MAX(N.GeneratedDate),
    AbsenceDays = MAX(A.AbsenceDays)
FROM
    AbsenceSpree AS A
    INNER JOIN NumberedDays AS N ON A.CurrentDateNumber = N.DayNumber
GROUP BY
    A.StudentID,
    A.StartingAbsenceDate
HAVING
    MAX(A.AbsenceDays) >= 3
OPTION
    (MAXRECURSION 5000)

If you want to list the original Attendance table rows, you can replace the last select:

SELECT
    StudentID = A.StudentID,
    StartingAbsenceDate = A.StartingAbsenceDate,
    EndingAbsenceDate = MAX(N.GeneratedDate),
    AbsenceDays = MAX(A.AbsenceDays)
FROM
    AbsenceSpree AS A
    INNER JOIN NumberedDays AS N ON A.CurrentDateNumber = N.DayNumber
GROUP BY
    A.StudentID,
    A.StartingAbsenceDate
HAVING
    MAX(A.AbsenceDays) >= 3

with this CTE + SELECT:

,
FilteredAbsenceSpree AS
(
    SELECT
        StudentID = A.StudentID,
        StartingAbsenceDate = A.StartingAbsenceDate,
        EndingAbsenceDate = MAX(N.GeneratedDate),
        AbsenceDays = MAX(A.AbsenceDays)
    FROM
        AbsenceSpree AS A
        INNER JOIN NumberedDays AS N ON A.CurrentDateNumber = N.DayNumber
    GROUP BY
        A.StudentID,
        A.StartingAbsenceDate
    HAVING
        MAX(A.AbsenceDays) >= 3
)
SELECT
    A.*
FROM
    Attendance AS A
    INNER JOIN FilteredAbsenceSpree AS F ON A.StudentID = F.StudentID
WHERE
    A.Date BETWEEN F.StartingAbsenceDate AND F.EndingAbsenceDate
OPTION
    (MAXRECURSION 5000)

Upvotes: 1

kc2018
kc2018

Reputation: 1460

Try this:

CTE contains the absence dates when a student was absent on both the day before and the day after (excluding weekend). The 2 UNION at the end add back the first and last of each group and eliminate the duplicates.

with cte(id, studentId, dateof , absenceReasonId)
as
(
select a.* 
from attendance a
where exists (select 1 from attendance preva
              where preva.studentID = a.studentID
              and   datediff(day, preva.dateof, a.dateof)
                    <= (case when datepart(dw, preva.dateof) >= 5
                        then 8 - datepart(dw, preva.dateof)
                        else 1 
                        end)
              and preva.dateof < a.dateof)
and exists (select 1 from attendance nexta
              where nexta.studentID = a.studentID
              and   datediff(day, a.dateof, nexta.dateof)
                    <= (case when datepart(dw, a.dateof) >= 5
                        then 8 - datepart(dw, a.dateof)
                        else 1 
                        end)
              and nexta.dateof > a.dateof))              

select cte.*
from cte
union  -- use union to remove duplicates
select preva.* 
from
attendance preva
inner join
cte
on preva.studentID = cte.studentID
and preva.dateof < cte.dateof
and datediff(day, preva.dateof, cte.dateof)
                    <= (case when datepart(dw, preva.dateof) >= 5
                        then 8 - datepart(dw, preva.dateof)
                        else 1 
                        end) 
union
select nexta.*
from attendance nexta
inner join
cte
on nexta.studentID = cte.studentID
and   datediff(day, cte.dateof, nexta.dateof)
       <= (case when datepart(dw, cte.dateof) >= 5
                then 8 - datepart(dw, cte.dateof)
                else 1 
            end)
and nexta.dateof > cte.dateof  
order by studentId, dateof 

sqlfiddle

Upvotes: 0

iSR5
iSR5

Reputation: 3498

If you need to get the absence in a time period (let's say in past 7 days), then you can do something like this

 SELECT 
    ID,
    StudentID,
    [Date], 
    AbsenceReasonID
FROM(
SELECT 
    ID,
    StudentID,
    [Date], 
    AbsenceReasonID, 
    COUNT(StudentID) OVER(PARTITION BY StudentID ORDER BY StudentID) AS con, 
    ((DATEPART(dw, [Date]) + @@DATEFIRST) % 7) AS dw
FROM attendance
) D
WHERE 
     D.con > 2
AND [Date] >= '2018-02-02'
AND [Date] <= GETDATE()
AND dw NOT IN(0,1)

and based on your given data the output will be

|     ID | StudentID |       Date | AbsenceReasonID |
|--------|-----------|------------|-----------------|
| 430957 |     10158 | 2018-02-02 |               2 |

you could adjust the output as you like.

SQL Fiddle

Upvotes: 0

Related Questions