Reputation: 309
I have a table which stores attendances of students for each day. I need to get students who are consecutively absent for 3 days. However, the dates when attendance is taken is not in a order, some days like nonattendance, holidays, weekends are excluded. The dates when students attended are the dates where records exist in that table .
The data is like
StudentId Date Attendance
-----------------------------------------
178234 1/1/2017 P
178234 5/1/2107 A
178234 6/1/2107 A
178234 11/1/2107 A
178432 1/1/2107 P
178432 5/1/2107 A
178432 6/1/2107 P
178432 11/1/2107 A
In the above case the result should be
StudentId AbsenceStartDate AbsenceEndDate ConsecutiveAbsences
----------------------------------------------------------------------------
178234 5/1/2017 11/1/2017 3
I have tried to implement this solution Calculating Consecutive Absences in SQL However that only worked for dates in order only. Any suggestions will be great, thanks
Upvotes: 3
Views: 3672
Reputation: 146557
try this:
declare @t table (sid int, d date, att char(1))
insert @t (sid,d, att) values
(178234, '1/1/2017','P'),
(178234, '5/1/2017','A'),
(178234, '6/1/2017','A'),
(178234, '11/1/2017','A'),
(178432, '1/1/2017','P'),
(178432, '5/1/2017','A'),
(178432, '6/1/2017','P'),
(178432, '11/1/2017','A')
Select s.sid, Min(s.d) startDt, Max(e.d) endDt, s.att, e.att, count(*)
from @t s join @t e on e.d <=
(select max(d) from @t m
Where sid = s.sid
and d > s.d
and att = 'A'
and not exists
(Select * from @t
where sid = s.sid
and d between s.d and m.d
and att = 'P'))
Where s.att = 'A'
and s.d = (Select Min(d) from @t
Where sid = s.sid
and d < e.d
and att = 'A')
group by s.sid, s.d, s.att, e.att
this is also tricky to explain:
basically, it joins the table to itself using aliases s (for start) and e (for end), where the s-row is the first row in a set of contiguous absences, and the e. rows are all following absences that are before the next date where the stud is present. This will generate a set of all the 'A' that do not have a P row within them. Then the sql groups by the appropriate values to return the earliest and latest date, and the count of rows, in each group.
The last where clause ensures that the s row is the first row in the group.
Upvotes: 1
Reputation: 1270583
Oh, you have both absences and presents in the table. You can use the difference of row_numbers()
approach:
select studentid, min(date), max(date)
from (select a.*,
row_number() over (partition by studentid order by date) as seqnum,
row_number() over (partition by studentid, attendance order by date) as seqnum_a
from attendance a
) a
where attendance = 'A'
group by studentid, (seqnum - seqnum_a)
having count(*) >= 3;
The difference of row numbers gets consecutive values that are the same. This is a little tricky to understand, but if you run the subquery, you should see how the difference is constant for consecutive absences or presents. You only care about absences, hence the where
in the outer query.
Upvotes: 7