Reputation: 15
I'm trying to calculate the monthly working hours of each delivery guy. This is data of a delivery company, and each delivery guy has a different time where he starts his shift and ends his shift.
This is a sample of how the table looks like:
Type_ID | Delivery_Guy_ID | Date_Created |
---|---|---|
5000 | 210 | 2020-11-16 16:34:43 |
7000 | 210 | 2020-11-16 16:35:24 |
3000 | 210 | 2020-11-16 16:35:46 |
3000 | 210 | 2020-11-16 16:37:41 |
4000 | 210 | 2020-11-16 16:39:41 |
3000 | 210 | 2020-11-16 16:42:53 |
4000 | 210 | 2020-11-16 16:47:53 |
3000 | 210 | 2020-11-16 16:48:16 |
4000 | 210 | 2020-11-16 16:50:16 |
3000 | 210 | 2020-11-16 16:53:01 |
2000 | 210 | 2020-11-16 18:53:07 |
Type ID = 2000 refers to 'Shift Ended' and type ID = 7000 refers to 'Shift Started'. A delivery guy might start his shift and end his shift several times during the day.
This is the query I tried:
WITH working_hours_cte AS
(
SELECT *
FROM (
SELECT id AS id
,
type_id AS type_id
,
delivery_guy_id AS delivery_guy_id
,
Timezone('Africa/Cairo', date_created) as date_created
FROM delivery_guys_event
ORDER BY delivery_guy_id,
date(date_created),
date_created ) AS t
WHERE t.date_created >= date('11-01-2020')
AND t.date_created <= date('11-30-2020') ), get_shift_start_time_cte AS
(
SELECT DISTINCT
ON(
t.delivery_guy_id) t.* ,
t.date_created::timestamptz AS shift_start_time
FROM working_hours_cte AS t
WHERE t.type_id = 7000
GROUP BY 1,
2,
3,
4
ORDER BY t.delivery_guy_id,
date(date_created),
t.date_created), get_shift_end_time_cte AS
(
SELECT DISTINCT
ON(
t.delivery_guy_id) t.* ,
t.date_created::timestamptz AS shift_end_time
FROM working_hours_cte AS t
WHERE t.type_id = 2000
GROUP BY 1,
2,
3,
4
ORDER BY t.delivery_guy_id,
date(date_created),
t.date_created), get_working_hours_cte1 AS
(
SELECT base.* ,
(date_part('day', ap.shift_end_time::timestamptz - aa.shift_start_time::timestamptz) * 24 + date_part('hour', ap.shift_end_time::timestamptz - aa.shift_start_time::timestamptz)) * 60 + date_part('minute', ap.shift_end_time::timestamptz - aa.shift_start_time::timestamptz) AS working_hours
FROM working_hours_cte AS base
LEFT JOIN get_shift_start_time_cte AS aa
ON base.delivery_guy_id = aa.delivery_guy_id
LEFT JOIN get_shift_end_time_cte AS ap
ON base.delivery_guy_id = ap.delivery_guy_id)
SELECT u.NAME AS delivery_guy_name
,
round(sum(timings.working_hours::numeric),0)/60 AS working_hours
FROM get_working_hours_cte1 AS timings
LEFT JOIN delivery_guys_contacts AS u
ON timings.delivery_guy_id = u.id
WHERE timings.date_created >= date('11/01/2020')
AND timings.date_created <= date('11/30/2020')
GROUP BY delivery_guy_name
ORDER BY delivery_guy_name ASC
Desired Output:
Delivery_guy_name | Working_hours |
---|---|
John | 200 |
Michael | 150 |
Joe | 230 |
I tried to self-join as well but none seems to be giving the correct number of hours. Can someone please tell me where the problem is?
Upvotes: 0
Views: 270
Reputation: 222582
How about summing the time difference between the current row and the next row, excepted if the current row is an end of shift? That could be easily done with window functions:
select delivery_guy_id, sum(diff) as total_diff
from (
select delivery_guy_id,
lead(date_created) over(partition by delivery_guy_id order by date_created)
- date_created as diff
from mytable
) t
wxhere type_id <> 2000
group by delivery_guy_id
This would work if there are no record outside the start/end of ships - that is, an end of shift is always followed by a start of shift.
Upvotes: 1