Reputation: 1171
Consider the following schema;
CREATE TABLE `Project Assignment`
(`Employee` varchar(1), `Project Id` int, `Project Assignment Date` date, `Project Relieving Date` date)
;
INSERT INTO `Project Assignment`
(`Employee`, `Project Id`, `Project Assignment Date`, `Project Relieving Date`)
VALUES
('A', 1, '2018-04-01', '2019-12-25'),
('A', 2, '2019-06-15', '2020-03-31'),
('A', 3, '2019-09-07', '2020-05-20'),
('A', 4, '2020-07-14', '2020-12-15')
;
CREATE TABLE `Break`
(`Break Id` int, `Employee` varchar(1), `Project Id` int, `Break Start Date` date, `Break End Date` date)
;
INSERT INTO `Break`
(`Break Id`, `Employee`, `Project Id`, `Break Start Date`, `Break End Date`)
VALUES
(1, 'A', 1, '2018-09-01', '2018-09-30'),
(2, 'A', 1, '2019-10-05', '2019-11-30'),
(3, 'A', 2, '2019-10-15', '2019-11-15'),
(4, 'A', 3, '2019-11-01', '2019-11-10'),
(5, 'A', 2, '2020-01-01', '2020-01-10'),
(6, 'A', 3, '2020-01-01', '2020-01-10')
;
During a project, an employee can take one or more breaks in each Project. The breaks don't overlap within Project but can overlap across projects.
We want Count of Days on which an Employee had at least one project assigned (minus) the days on which the employee was on break on all assigned projects.
I was able to derive Distinct Days the Employee was assigned to Projects by using below query:
SELECT merged.employee,
SUM(DATEDIFF(merged.EndDate,merged.`Project Assignment Date`)+1) assigned_days
FROM (SELECT
s1.employee, s1.`Project Assignment Date`,
MIN(IFNULL(t1.`Project Relieving Date`,CURDATE())) AS EndDate
FROM `Project Assignment` s1
INNER JOIN `Project Assignment` t1
ON t1.employee = s1.employee
AND s1.`Project Assignment Date` <= IFNULL(t1.`Project Relieving Date`,CURDATE())
AND NOT EXISTS( SELECT * FROM `Project Assignment` t2
WHERE t2.employee = s1.employee
AND IFNULL(t1.`Project Relieving Date`,CURDATE()) >= t2.`Project Assignment Date`
AND IFNULL(t1.`Project Relieving Date`,CURDATE()) < IFNULL(t2.`Project Relieving Date`,CURDATE()))
WHERE NOT EXISTS( SELECT * FROM `Project Assignment` s2
WHERE s2.employee = s1.employee
AND s1.`Project Assignment Date` > s2.`Project Assignment Date`
AND s1.`Project Assignment Date` <= IFNULL(s2.`Project Relieving Date`,CURDATE()))
GROUP BY s1.employee, s1.`Project Assignment Date`
ORDER BY s1.`Project Assignment Date`) merged
GROUP BY merged.employee
Result:
| employee | assigned_days |
| -------- | ------------- |
| A | 936 |
But couldn't think of a way to derive days on which the person was on break on all assigned projects.
Expected Result:
+----------+---------------+------------+-------------+
| employee | assigned_days | break_days | worked_days |
+==========+===============+============+=============+
| A | 936 | 50 | 886 |
+----------+---------------+------------+-------------+
Mariadb 10.3.29
Explanation of working out break_days
+----------+---------+-------------+------------------+-----------------+-------------------------------------------------------------------------------------------------------------------+
| Employee | Project | Break Start | Break End | Days Considered | Remarks |
+==========+=========+=============+==================+=================+===================================================================================================================+
| A | 1 | 2018-09-01 | 2018-09-30 | 30 | Only one project assigned so consider whole break |
+----------+---------+-------------+------------------+-----------------+-------------------------------------------------------------------------------------------------------------------+
| A | 1 | 2019-10-05 | 2019-11-30 | 10 | 3 Projects were assigned during these breaks. The common days of break fall between 2019-11-01 and 2019-11-10 |
+----------+---------+-------------+------------------+ | |
| A | 2 | 2019-10-15 | 2019-11-15 | | |
+----------+---------+-------------+------------------+ | |
| A | 3 | 2019-11-01 | 2019-11-10 | | |
+----------+---------+-------------+------------------+-----------------+-------------------------------------------------------------------------------------------------------------------+
| A | 2 | 2020-01-01 | 2020-01-10 | 10 | 2 Projects were assigned during this time and break in both projects |
+----------+---------+-------------+------------------+ | |
| A | 3 | 2020-01-01 | 2020-01-10 | | |
+----------+---------+-------------+------------------+-----------------+-------------------------------------------------------------------------------------------------------------------+
| | | | Total Break Days | 50 | |
+----------+---------+-------------+------------------+-----------------+-------------------------------------------------------------------------------------------------------------------+
Link for DB-Fiddle: https://www.db-fiddle.com/f/c8fMneAUkhb2P3rzjMtVZm/0
Upvotes: 0
Views: 114
Reputation: 1171
After adding Break Id
column to Break
table, I could building upon aggregation technique suggested by @forpass to derive break days:
Then, for each date in both cases, with aggregation get all the projects as a comma separated list with GROUP_CONCAT().
For each break get the count and list of overlapping projects(using GROUP_CONCAT).
Then join it again over Break
to find count and list of overlapping breaks along with the smallest common overlap (lastest start & earliest end). Use ROW_NUMBER
to eliminate duplicates.
Move query for Assigned Days into another CTE and joined with CTE for breaks to get the desired result.
WITH breaks_summary AS (
SELECT `Employee`, SUM(break_days) break_days
FROM (
SELECT b.`Employee`, DATEDIFF(b.end_date, b.start_date)+1 break_days, ROW_NUMBER() OVER (PARTITION BY b.break_ids) rn, overlapping_breaks, break_ids, projects_count
FROM (
SELECT b_p_cnt.`Employee`, b_p_cnt.`Project Id`, b_p_cnt.projects_count,
COUNT(b2.`Break Id`) overlapping_breaks, GROUP_CONCAT(b2.`Break Id`) break_ids, MAX(b2.start_date) start_date, MIN(b2.end_date) end_date
FROM (
SELECT b1.`Break Id`, b1.`Employee`, b1.`Project Id`, b1.start_date, b1.end_date, GROUP_CONCAT(pa.`Project Id`) projects, count(pa.`Project Id`) projects_count
FROM (
SELECT `Break Id`, `Employee`, `Project Id`, `Break Start Date` AS start_date, `Break End Date` AS end_date
FROM `Break`
) b1
LEFT JOIN `Project Assignment` pa ON b1.`Employee` = pa.`Employee`
AND ((b1.start_date BETWEEN pa.`Project Assignment Date` AND IFNULL(pa.`Project Relieving Date`,CURDATE()))
OR (b1.end_date BETWEEN pa.`Project Assignment Date` AND IFNULL(pa.`Project Relieving Date`,CURDATE())))
GROUP BY b1.`Break Id`, b1.`Employee`, b1.`Project Id`, b1.start_date, b1.end_date) b_p_cnt
LEFT JOIN (
SELECT `Break Id`, `Employee`, `Project Id`, `Break Start Date` AS start_date, `Break End Date` AS end_date
FROM `Break`
ORDER BY `Break Id`) b2 ON b_p_cnt.`Employee` = b2.`Employee`
AND ((b_p_cnt.start_date BETWEEN b2.start_date AND b2.end_date)
OR (b_p_cnt.end_date BETWEEN b2.start_date AND b2.end_date))
GROUP BY b_p_cnt.`Break Id`, b_p_cnt.`Employee`, b_p_cnt.`Project Id`,
b_p_cnt.start_date, b_p_cnt.end_date, b_p_cnt.projects, b_p_cnt.projects_count
HAVING count(b2.`Break Id`) = b_p_cnt.projects_count
ORDER BY b_p_cnt.`Employee`, `Project Id`) b
) breaks
WHERE rn = 1
GROUP BY `Employee`),
assigned AS (
SELECT merged.`Employee`, SUM(DATEDIFF(merged.EndDate,merged.`Project Assignment Date`)+1) assigned_days
FROM (SELECT s1.`Employee`, s1.`Project Assignment Date`,
MIN(IFNULL(t1.`Project Relieving Date`,CURDATE())) AS EndDate
FROM `Project Assignment` s1
INNER JOIN `Project Assignment` t1 ON t1.`Employee` = s1.`Employee`
AND s1.`Project Assignment Date` <= IFNULL(t1.`Project Relieving Date`,CURDATE())
AND NOT EXISTS( SELECT * FROM `Project Assignment` t2
WHERE t2.`Employee` = s1.`Employee`
AND IFNULL(t1.`Project Relieving Date`,CURDATE()) >= t2.`Project Assignment Date`
AND IFNULL(t1.`Project Relieving Date`,CURDATE()) < IFNULL(t2.`Project Relieving Date`,CURDATE()))
WHERE NOT EXISTS( SELECT * FROM `Project Assignment` s2
WHERE s2.`Employee` = s1.`Employee`
AND s1.`Project Assignment Date` > s2.`Project Assignment Date`
AND s1.`Project Assignment Date` <= IFNULL(s2.`Project Relieving Date`,CURDATE()))
GROUP BY s1.`Employee`, s1.`Project Assignment Date`
ORDER BY s1.`Project Assignment Date`) merged
GROUP BY merged.`Employee`)
SELECT ad.`Employee`,
ad.assigned_days,
IFNULL(bs.break_days,0) break_days,
(ad.assigned_days - IFNULL(bs.break_days,0)) worked_days
FROM assigned ad
LEFT JOIN breaks_summary bs ON ad.`Employee` = bs.`Employee`
Updated DB-Fiddle with the query: https://www.db-fiddle.com/f/c8fMneAUkhb2P3rzjMtVZm/3
Thanks to all those who contributed by improving question as well as providing possible answers.
Upvotes: 0
Reputation: 164099
Use recursive CTEs to get all working and all break dates for each employee.
Then, for each date in both cases, with aggregation get all the projects as a comma separated list with GROUP_CONCAT()
.
If these lists match for a certain date then this is a break date.
WITH RECURSIVE
working_dates AS (
SELECT `Employee`, `Project Id`, `Project Assignment Date` AS date, `Project Relieving Date`
FROM `Project Assignment`
UNION ALL
SELECT `Employee`, `Project Id`, date + INTERVAL 1 day, `Project Relieving Date`
FROM working_dates
WHERE date < `Project Relieving Date`
),
break_dates AS (
SELECT `Employee`, `Project Id`, `Break Start Date` AS date, `Break End Date`
FROM `Break`
UNION ALL
SELECT `Employee`, `Project Id`, date + INTERVAL 1 day, `Break End Date`
FROM break_dates
WHERE date < `Break End Date`
),
working AS (
SELECT `Employee`, date,
GROUP_CONCAT(`Project Id` ORDER BY `Project Id`) projects
FROM working_dates
GROUP BY `Employee`, date
),
breaks AS (
SELECT `Employee`, date,
GROUP_CONCAT(`Project Id` ORDER BY `Project Id`) projects
FROM break_dates
GROUP BY `Employee`, date
)
SELECT w.`Employee`,
COUNT(*) assigned_days,
COUNT(b.date) AS break_days,
COUNT(*) - COUNT(b.date) worked_days
FROM working w LEFT JOIN breaks b
ON w.`Employee` = b.`Employee` AND w.date = b.date AND w.projects = b.projects
GROUP BY w.`Employee`
See the demo.
Upvotes: 1