Reputation: 47
I need to group and sum the rows of one day according to continuous worker date range.
Table attendance definition:
row_no NUMBER (*,0) NOT NULL, -- row number - generated from a sequence
worker_id NUMBER NOT NULL, -- Attendance worker id
date1 DATE DEFAULT SYSDATE NOT NULL, -- Attendance Date/time
type1 NUMBER(3,0) NOT NULL, -- Attendance type: 0-Enter, 1-Exit
worker_id date1 type1
2 13/06/2016-09:00 0
3 13/06/2016-12:10 0
2 13/06/2016-13:20 1
2 13/06/2016-15:00 0
2 13/06/2016-17:00 1
3 13/06/2016-18:45 1
2 13/06/2016-19:00 0
Result if report is run at 22:00
worker_id date1 fr_hour to_hour hours
2 13/06/2016 09:00 13:20 4:20
2 13/06/2016 15:00 17:00 2:00
2 13/06/2016 19:00 22:00 3:00
3 13/06/2016 12:10 18:45 6:35
Upvotes: 1
Views: 114
Reputation: 191235
Taking continuous periods that start on an earlier day complicates it a bit. You can either calculate all the ranges etc. for all dates, back to the start of time - assuming you don't archive off old records and the very first entry for any worker in the data isn't a check-out - and then after doing all that work filter on the date you're interested in. Or you can look only at that date's data and see if a worker's records start with a check-in or check-out.
I've added records for a fourth worker:
WORKER_ID DATE1 TYPE1
---------- ---------------- ----------
4 2016-06-12 19:00 0
4 2016-06-13 03:00 1
2 2016-06-13 09:00 0
3 2016-06-13 12:10 0
4 2016-06-13 13:00 0
2 2016-06-13 13:20 1
4 2016-06-13 14:30 1
2 2016-06-13 15:00 0
2 2016-06-13 17:00 1
3 2016-06-13 18:45 1
4 2016-06-13 19:00 0
2 2016-06-13 19:00 0
You can use analytic functions to work out a row number for each entry, and also find the first type1
value for each worker that day; this also effectively pivots to get the time in and out as separate columns:
select worker_id, trunc(date1) as date1, type1,
case when type1 = 0 then date1 end as time_in,
case when type1 = 1 then date1 end as time_out,
row_number() over (partition by worker_id, trunc(date1), type1 order by date1) as rn,
min(type1) keep (dense_rank first order by date1) over (partition by worker_id, trunc(date1)) as open_start,
max(type1) keep (dense_rank last order by date1) over (partition by worker_id, trunc(date1)) as open_end,
row_number() over (partition by worker_id, trunc(date1), type1 order by date1)
- case when type1 = 1 then min(type1) keep (dense_rank first order by date1)
over (partition by worker_id, trunc(date1)) else 0 end as grp
from attendance
where date1 >= date '2016-06-13' and date1 < date '2016-06-14'
order by worker_id, attendance.date1;
WORKER_ID DATE1 TYPE1 TIME_IN TIME_OUT RN OPEN_START OPEN_END GRP
---------- ---------------- ---------- ---------------- ---------------- ---------- ---------- ---------- ----------
2 2016-06-13 00:00 0 2016-06-13 09:00 1 0 0 1
2 2016-06-13 00:00 1 2016-06-13 13:20 1 0 0 1
2 2016-06-13 00:00 0 2016-06-13 15:00 2 0 0 2
2 2016-06-13 00:00 1 2016-06-13 17:00 2 0 0 2
2 2016-06-13 00:00 0 2016-06-13 19:00 3 0 0 3
3 2016-06-13 00:00 0 2016-06-13 12:10 1 0 1 1
3 2016-06-13 00:00 1 2016-06-13 18:45 1 0 1 1
4 2016-06-13 00:00 1 2016-06-13 03:00 1 1 0 0
4 2016-06-13 00:00 0 2016-06-13 13:00 1 1 0 1
4 2016-06-13 00:00 1 2016-06-13 14:30 2 1 0 1
4 2016-06-13 00:00 0 2016-06-13 19:00 2 1 0 2
The rn
column is a raw (naive) attempt to group in and out records together, but for worker 4 that goes out of step. The open_start
works out if the first record was a check-out. The value that gets - either zero or 1 - can then be subtracted from rn
to get a more useful grouping flag, which I've called grp
.
You can then use that as an inline view or CTE and aggregate the time in/out records for each group, adding nvl()
or coalesce()
to put in missing midnight-start or 10pm-end values:
select worker_id,
date1 as date1,
nvl(min(time_in), date1) as fr_hour,
nvl(max(time_out), date1 + 22/24) as to_hour,
date1 + (nvl(max(time_out), date1 + 22/24) - date1)
- (nvl(min(time_in), date1) - date1) as hours
from (
select worker_id,
trunc(date1) as date1,
case when type1 = 0 then date1 end as time_in,
case when type1 = 1 then date1 end as time_out,
row_number() over (partition by worker_id, trunc(date1), type1 order by date1)
- case when type1 = 1 then min(type1) keep (dense_rank first order by date1)
over (partition by worker_id, trunc(date1)) else 0 end as grp
from attendance
where date1 >= date '2016-06-13' and date1 < date '2016-06-14'
)
group by worker_id, date1, grp
order by worker_id, date1, grp;
WORKER_ID DATE1 FR_HOUR TO_HOUR HOURS
---------- ---------------- ---------------- ---------------- ----------------
2 2016-06-13 00:00 2016-06-13 09:00 2016-06-13 13:20 2016-06-13 04:20
2 2016-06-13 00:00 2016-06-13 15:00 2016-06-13 17:00 2016-06-13 02:00
2 2016-06-13 00:00 2016-06-13 19:00 2016-06-13 22:00 2016-06-13 03:00
3 2016-06-13 00:00 2016-06-13 12:10 2016-06-13 18:45 2016-06-13 06:35
4 2016-06-13 00:00 2016-06-13 00:00 2016-06-13 03:00 2016-06-13 03:00
4 2016-06-13 00:00 2016-06-13 13:00 2016-06-13 14:30 2016-06-13 01:30
4 2016-06-13 00:00 2016-06-13 19:00 2016-06-13 22:00 2016-06-13 03:00
The 'hours' value manipulates the date and the time-in/out values to come up with what looks like another time; but it's actually the elapsed time.
Finally you can format the columns to remove the bits you aren't interested in:
select worker_id,
to_char(date1, 'DD/MM/YYYY') as date1,
to_char(nvl(min(time_in), date1), 'HH24:MI') as fr_hour,
to_char(nvl(max(time_out), date1 + 22/24), 'HH24:MI') as to_hour,
to_char(date1 + (nvl(max(time_out),date1 + 22/24) - date1)
- (nvl(min(time_in), date1) - date1), 'HH24:MI') as hours
from (
select worker_id,
trunc(date1) as date1,
case when type1 = 0 then date1 end as time_in,
case when type1 = 1 then date1 end as time_out,
row_number() over (partition by worker_id, trunc(date1), type1 order by date1)
- case when type1 = 1 then min(type1) keep (dense_rank first order by date1)
over (partition by worker_id, trunc(date1)) else 0 end as grp
from attendance
where date1 >= date '2016-06-13' and date1 < date '2016-06-14'
)
group by worker_id, date1, grp
order by worker_id, date1, grp;
WORKER_ID DATE1 FR_HO TO_HO HOURS
---------- ---------- ----- ----- -----
2 13/06/2016 09:00 13:20 04:20
2 13/06/2016 15:00 17:00 02:00
2 13/06/2016 19:00 22:00 03:00
3 13/06/2016 12:10 18:45 06:35
4 13/06/2016 00:00 03:00 03:00
4 13/06/2016 13:00 14:30 01:30
4 13/06/2016 19:00 22:00 03:00
Upvotes: 0
Reputation: 1948
In the inner query we get for every row the date1,type1 from the next row (LEAD 1) for the same worker and than we filter only what we need:
SELECT worker_id,
TRUNC (date1) AS date1,
TO_CHAR (date1, 'HH24:MI') fr_hour,
TO_CHAR (date2, 'HH24:MI') to_hour,
TRUNC ( (date2 - date1) * 24) || ':' ||
TO_CHAR (TRUNC ( (date2 - date1) * 24 * 60) - TRUNC ( (date2 - date1) * 24) * 60, '00') hours
FROM (SELECT a.*,
LEAD (a.date1, 1) OVER (PARTITION BY worker_id ORDER BY date1) date2,
LEAD (a.type1, 1) OVER (PARTITION BY worker_id ORDER BY date1) type2
FROM testtemp a)
WHERE type1 = 0
AND type2 = 1
AND TRUNC (date1) = TRUNC (date2)
Upvotes: 1