Reputation: 1813
In this fiddle, I've added data filter conditions on employee_name
, task_date
and region
in the JOIN
clause. It seems to me I'm doing something wrong here as, in the output, I can't see the data filtered.
I expect to see data only between the dates '2015-01-03'
& '2015-01-06'
whereas the current query is returning data equivalent to the date filter not being present.
Also, is there a better way to write this query as the fiddle only designed to get the correct SQL whereas the data in production environment is worth several years & the query never seems to complete (It could run even more than 30 mins & keep running happily until killed). The database is quite optimized for performance in case someone is wondering if tuning the DB may be an option.
Any pointers would be really helpful here please.
INPUT/OUTPUT
CODE
CREATE TABLE ForgeRock
(`task_date` date, `employee_name` varchar(7), `task_name` varchar(55), `region` varchar(100));
INSERT INTO ForgeRock
(`task_date`, `employee_name`, `task_name`, `region`)
VALUES
('2015-01-01', 'A', 'task A','USA'),
('2015-01-01', 'B', 'task B','Russia'),
('2015-01-01', 'C', 'task C','USA'),
('2015-01-01', 'D', 'task D','USA'),
('2015-01-02', 'A', 'task A','Russia'),
('2015-01-02', 'B', 'task B','Singapore'),
('2015-01-02', 'C', 'task C','USA'),
('2015-01-02', 'D', 'task D','USA'),
('2015-01-03', 'A', 'task C','Australia'),
('2015-01-03', 'B', 'task B','London'),
('2015-01-03', 'C', 'task D','USA'),
('2015-01-03', 'D', 'task A','USA'),
('2015-01-03', 'C', 'task C','London'),
('2015-01-04', 'A', 'task B','USA'),
('2015-01-04', 'B', 'task A','Singapore'),
('2015-01-04', 'C', 'task C','USA'),
('2015-01-04', 'D', 'task D','India'),
('2015-01-05', 'A', 'task F','USA'),
('2015-01-05', 'B', 'task F','USA'),
('2015-01-05', 'C', 'task G','China'),
('2015-01-05', 'D', 'task B','USA'),
('2015-01-06', 'A', 'task Y','USA'),
('2015-01-06', 'B', 'task X','USA'),
('2015-01-06', 'C', 'task E','USA'),
('2015-01-06', 'D', 'task R','USA'),
('2015-01-07', 'A', 'task W','China'),
('2015-01-07', 'B', 'task O','Russia'),
('2015-01-07', 'C', 'task P','USA'),
('2015-01-07', 'D', 'task S','London'),
('2015-01-07', 'C', 'task E','USA'),
('2015-01-08', 'A', 'task E','USA'),
('2015-01-08', 'B', 'task W','USA'),
('2015-01-08', 'C', 'task C','USA'),
('2015-01-08', 'D', 'task B','London');
SQL QUERY
SELECT task_date,
employee_name,
Group_concat(task_name)
FROM (
SELECT DISTINCT a.task_date,
a.employee_name,
CASE
WHEN b.employee_name IS NOT NULL
AND c.employee_name IS NULL THEN NULL
ELSE a.task_name
END AS task_name
FROM forgerock AS a
LEFT OUTER JOIN forgerock AS b
ON a.employee_name = b.employee_name = 'A'
AND a.task_date >= '2015-01-03'
AND a.task_date <= '2015-01-06'
AND b.task_date >= '2015-01-03'
AND b.task_date <= '2015-01-06'
AND a.task_date - 1 = b.task_date
AND a.region = b.region = 'USA'
LEFT OUTER JOIN forgerock AS c
ON a.employee_name = c.employee_name = 'A'
AND a.task_date >= '2015-01-03'
AND a.task_date <= '2015-01-06'
AND c.task_date >= '2015-01-03'
AND c.task_date <= '2015-01-06'
AND a.task_date - 1 = c.task_date
AND a.task_name <> c.task_name
AND a.region = c.region = 'USA'
ORDER BY a.task_date,
a.employee_name,
a.task_name) AS temp
GROUP BY task_date,
employee_name
Upvotes: 0
Views: 63
Reputation: 3864
Not sure what you are trying to achieve with this query, but all your joins are left outer joins so the a table (the one from the FROM) will never be filtered. You should always check the query plan which for your query simply states no where condition on the a table:
Simple fix would be to specify a WHERE condition on the a table or switch to inner joins.
As performance is concerned you should again have a look at the query plan. If you are filtering a large table with no index (you see only using where in the query plan, example:
) that usually is a sign to perform a deeper investigation.
Upvotes: 1
Reputation: 2480
The join is on the full table "a" no matter the date so you need to add the where (see the **) and no need to put it in the join. I'm not sure but the CASE with IS NULL doesn't work everytime, i prefer to use a coalesce :
SELECT task_date,
employee_name,
Group_concat(task_name)
FROM (
SELECT DISTINCT a.task_date,
a.employee_name,
CASE
WHEN b.employee_name IS NOT NULL
AND COALESCE(c.employee_name, '00') THEN '00'
ELSE a.task_name
END AS task_name
FROM forgerock AS a
LEFT OUTER JOIN forgerock AS b
ON a.employee_name = b.employee_name = 'A'
AND b.task_date >= '2015-01-03'
AND b.task_date <= '2015-01-06'
AND a.task_date - 1 = b.task_date
AND a.region = b.region = 'USA'
LEFT OUTER JOIN forgerock AS c
ON a.employee_name = c.employee_name = 'A'
AND c.task_date >= '2015-01-03'
AND c.task_date <= '2015-01-06'
AND a.task_date - 1 = c.task_date
AND a.task_name <> c.task_name
AND a.region = b.region = 'USA'
**WHERE a.task_date >= '2015-01-03' AND a.task_date <= '2015-01-06'**
ORDER BY a.task_date,
a.employee_name,
a.task_name) AS temp
GROUP BY task_date,
employee_name
Upvotes: 2