user1375481
user1375481

Reputation: 309

SQL Server : finding consecutive absence counts for students over custom dates

I have a table which stores attendances of students for each day. I need to get students who are consecutively absent for 3 days. However, the dates when attendance is taken is not in a order, some days like nonattendance, holidays, weekends are excluded. The dates when students attended are the dates where records exist in that table .

The data is like

StudentId       Date           Attendance
-----------------------------------------
178234          1/1/2017          P
178234          5/1/2107          A
178234          6/1/2107          A
178234          11/1/2107         A
178432          1/1/2107          P
178432          5/1/2107          A
178432          6/1/2107          P
178432          11/1/2107         A

In the above case the result should be

StudentId        AbsenceStartDate     AbsenceEndDate     ConsecutiveAbsences
----------------------------------------------------------------------------
178234           5/1/2017             11/1/2017           3

I have tried to implement this solution Calculating Consecutive Absences in SQL However that only worked for dates in order only. Any suggestions will be great, thanks

Upvotes: 3

Views: 3672

Answers (2)

Charles Bretana
Charles Bretana

Reputation: 146557

try this:

declare @t table (sid int, d date, att char(1))
insert @t (sid,d, att) values
(178234, '1/1/2017','P'),
(178234, '5/1/2017','A'),
(178234, '6/1/2017','A'),
(178234, '11/1/2017','A'),
(178432, '1/1/2017','P'),
(178432, '5/1/2017','A'),
(178432, '6/1/2017','P'),
(178432, '11/1/2017','A')

Select s.sid, Min(s.d) startDt, Max(e.d) endDt, s.att, e.att, count(*)
from @t s join @t e on e.d <=  
   (select max(d) from @t m
    Where sid = s.sid
       and d > s.d
       and att = 'A'
       and not exists 
          (Select * from @t
           where sid = s.sid
              and d between s.d and m.d
              and att = 'P'))
Where s.att = 'A' 
   and s.d = (Select Min(d) from @t
              Where sid = s.sid
                and d < e.d
                and att = 'A')
group by s.sid, s.d, s.att, e.att

this is also tricky to explain: basically, it joins the table to itself using aliases s (for start) and e (for end), where the s-row is the first row in a set of contiguous absences, and the e. rows are all following absences that are before the next date where the stud is present. This will generate a set of all the 'A' that do not have a P row within them. Then the sql groups by the appropriate values to return the earliest and latest date, and the count of rows, in each group.
The last where clause ensures that the s row is the first row in the group.

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1270583

Oh, you have both absences and presents in the table. You can use the difference of row_numbers() approach:

select studentid, min(date), max(date)
from (select a.*,
             row_number() over (partition by studentid order by date) as seqnum,
             row_number() over (partition by studentid, attendance order by date) as seqnum_a
      from attendance a
     ) a
where attendance = 'A'
group by studentid, (seqnum - seqnum_a)
having count(*) >= 3;

The difference of row numbers gets consecutive values that are the same. This is a little tricky to understand, but if you run the subquery, you should see how the difference is constant for consecutive absences or presents. You only care about absences, hence the where in the outer query.

Upvotes: 7

Related Questions