Reputation: 135
I have two tables that I am trying to LEFT JOIN, but I am not getting the expected results.
Rooms have multiple Children on different days; however, a Child is only counted in a Room after they have started, and only if they have hours allocated on that day. The output I am trying to achieve (places remaining per day) is this:
Room | MaxNum | Mon(Week1) | Tue(Week1) | Mon(Week2) | Tue(Week2)
Blue | 5 | 4 | 4 | 3 | 2
Green | 10 | 10 | 10 | 9 | 9
Red | 15 | 15 | 15 | 15 | 15
Here is the schema and some data...
CREATE TABLE Rooms (
    id INT,
    RoomName VARCHAR(10),
    MaxNum INT
);
CREATE TABLE Children (
    id INT,
    RoomID INT,
    MonHrs INT,
    TueHrs INT,
    StartDate DATE
);
INSERT INTO Rooms VALUES (1, 'Blue', 5);
INSERT INTO Rooms VALUES (2, 'Green', 10);
INSERT INTO Rooms VALUES (3, 'Red', 15);
INSERT INTO Children VALUES (1, 1, 5, 0, '2018-12-02');
INSERT INTO Children VALUES (2, 1, 0, 5, '2018-12-02');
INSERT INTO Children VALUES (3, 1, 5, 5, '2018-12-09');
INSERT INTO Children VALUES (4, 1, 0, 5, '2018-12-09');
INSERT INTO Children VALUES (5, 2, 5, 0, '2018-12-09');
INSERT INTO Children VALUES (6, 2, 0, 5, '2018-12-09');
The SQL I am having trouble with is this. It may not be the correct approach.
SELECT R.RoomName, R.MaxNum,
R.MaxNum - SUM(CASE WHEN C1.MonHrs > 0 THEN 1 ELSE 0 END) AS Mon1,
R.MaxNum - SUM(CASE WHEN C1.TueHrs > 0 THEN 1 ELSE 0 END) AS Tue1,
R.MaxNum - SUM(CASE WHEN C2.MonHrs > 0 THEN 1 ELSE 0 END) AS Mon2,
R.MaxNum - SUM(CASE WHEN C2.TueHrs > 0 THEN 1 ELSE 0 END) AS Tue2
FROM Rooms R
LEFT JOIN Children C1
ON R.id = C1.RoomID
AND C1.StartDate <= '2018-12-02'
LEFT JOIN Children C2
ON R.id = C2.RoomID
AND C2.StartDate <= '2018-12-09'
GROUP BY R.RoomName;
The rows are being duplicated across the two LEFT JOINs, which throws the counts way off, and I don't know how to prevent it. You can see the effect if you replace the SELECT list with *.
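The duplication can be made visible by counting the joined rows per room. The sketch below is only an illustration: it assumes SQLite (through Python's standard sqlite3 module) as a stand-in for whatever engine is actually in use, and the `joined_rows` column is added purely for inspection.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Rooms (id INT, RoomName VARCHAR(10), MaxNum INT);
CREATE TABLE Children (id INT, RoomID INT, MonHrs INT, TueHrs INT, StartDate DATE);
INSERT INTO Rooms VALUES (1, 'Blue', 5), (2, 'Green', 10), (3, 'Red', 15);
INSERT INTO Children VALUES
    (1, 1, 5, 0, '2018-12-02'), (2, 1, 0, 5, '2018-12-02'),
    (3, 1, 5, 5, '2018-12-09'), (4, 1, 0, 5, '2018-12-09'),
    (5, 2, 5, 0, '2018-12-09'), (6, 2, 0, 5, '2018-12-09');
""")

# Same two LEFT JOINs as the problem query, plus a row count per room.
rows = conn.execute("""
SELECT R.RoomName,
       COUNT(*) AS joined_rows,
       R.MaxNum - SUM(CASE WHEN C1.MonHrs > 0 THEN 1 ELSE 0 END) AS Mon1
FROM Rooms R
LEFT JOIN Children C1
       ON R.id = C1.RoomID AND C1.StartDate <= '2018-12-02'
LEFT JOIN Children C2
       ON R.id = C2.RoomID AND C2.StartDate <= '2018-12-09'
GROUP BY R.RoomName, R.MaxNum
ORDER BY R.RoomName
""").fetchall()

for row in rows:
    print(row)
# ('Blue', 8, 1)    <- 2 C1 rows x 4 C2 rows = 8, so child 1 is counted 4 times
# ('Green', 2, 10)
# ('Red', 1, 15)
```

For Blue, every row matched by C1 is repeated once per row matched by C2, so the single Monday child in week 1 is subtracted four times and Mon1 comes out as 1 instead of the expected 4.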
Any suggestions would help a lot.
Upvotes: 4
Views: 1193
Reputation: 522636
This sort of problem usually comes from aggregating at too broad a point in the query, after the joins have already multiplied the rows, so that records get counted more than once. Try aggregating the Children table in a separate subquery:
SELECT
    R.RoomName,
    R.MaxNum,
    -- COALESCE: a room with no children has no row in C, so its counts would otherwise be NULL
    R.MaxNum - COALESCE(C.Mon1, 0) AS Mon1,
    R.MaxNum - COALESCE(C.Tue1, 0) AS Tue1,
    R.MaxNum - COALESCE(C.Mon2, 0) AS Mon2,
    R.MaxNum - COALESCE(C.Tue2, 0) AS Tue2
FROM Rooms R
LEFT JOIN
(
    SELECT
        RoomID,
        COUNT(CASE WHEN MonHrs > 0 AND StartDate <= '2018-12-02'
                   THEN 1 END) AS Mon1,
        COUNT(CASE WHEN TueHrs > 0 AND StartDate <= '2018-12-02'
                   THEN 1 END) AS Tue1,
        COUNT(CASE WHEN MonHrs > 0 AND StartDate <= '2018-12-09'
                   THEN 1 END) AS Mon2,
        COUNT(CASE WHEN TueHrs > 0 AND StartDate <= '2018-12-09'
                   THEN 1 END) AS Tue2
    FROM Children
    GROUP BY RoomID
) C
    ON R.id = C.RoomID;
Note that we can avoid the double left join in your original query by instead using conditional aggregation on the start date.
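As a rough check of the subquery approach against the sample data from the question, here is a small sketch. It assumes SQLite (via Python's standard sqlite3 module) rather than the asker's actual engine, adds an ORDER BY only to make the output order predictable, and wraps each count in COALESCE so that a room with no Children rows (Red here) subtracts 0 rather than NULL.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Rooms (id INT, RoomName VARCHAR(10), MaxNum INT);
CREATE TABLE Children (id INT, RoomID INT, MonHrs INT, TueHrs INT, StartDate DATE);
INSERT INTO Rooms VALUES (1, 'Blue', 5), (2, 'Green', 10), (3, 'Red', 15);
INSERT INTO Children VALUES
    (1, 1, 5, 0, '2018-12-02'), (2, 1, 0, 5, '2018-12-02'),
    (3, 1, 5, 5, '2018-12-09'), (4, 1, 0, 5, '2018-12-09'),
    (5, 2, 5, 0, '2018-12-09'), (6, 2, 0, 5, '2018-12-09');
""")

# Children is aggregated to one row per room BEFORE it is joined to Rooms,
# so the join cannot multiply rows.
rows = conn.execute("""
SELECT R.RoomName, R.MaxNum,
       R.MaxNum - COALESCE(C.Mon1, 0) AS Mon1,
       R.MaxNum - COALESCE(C.Tue1, 0) AS Tue1,
       R.MaxNum - COALESCE(C.Mon2, 0) AS Mon2,
       R.MaxNum - COALESCE(C.Tue2, 0) AS Tue2
FROM Rooms R
LEFT JOIN (
    SELECT RoomID,
           COUNT(CASE WHEN MonHrs > 0 AND StartDate <= '2018-12-02' THEN 1 END) AS Mon1,
           COUNT(CASE WHEN TueHrs > 0 AND StartDate <= '2018-12-02' THEN 1 END) AS Tue1,
           COUNT(CASE WHEN MonHrs > 0 AND StartDate <= '2018-12-09' THEN 1 END) AS Mon2,
           COUNT(CASE WHEN TueHrs > 0 AND StartDate <= '2018-12-09' THEN 1 END) AS Tue2
    FROM Children
    GROUP BY RoomID
) C ON R.id = C.RoomID
ORDER BY R.RoomName
""").fetchall()

for row in rows:
    print(row)
# ('Blue', 5, 4, 4, 3, 2)
# ('Green', 10, 10, 10, 9, 9)
# ('Red', 15, 15, 15, 15, 15)
```

On this data the result matches the table the question asks for, including the Red row, which has no children at all.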
Late edit: You probably don't even need a subquery at all, q.v. the answer by @Salman. But either of our answers should resolve the double counting problem.
Upvotes: 2
Reputation: 272386
You need to use a single LEFT JOIN and move the date filter from the JOIN condition into the aggregate:
SELECT R.id, R.RoomName, R.MaxNum
, R.MaxNum - COUNT(CASE WHEN C.StartDate <= '2018-12-02' AND C.MonHrs > 0 THEN 1 END) AS Mon1
, R.MaxNum - COUNT(CASE WHEN C.StartDate <= '2018-12-02' AND C.TueHrs > 0 THEN 1 END) AS Tue1
, R.MaxNum - COUNT(CASE WHEN C.StartDate <= '2018-12-09' AND C.MonHrs > 0 THEN 1 END) AS Mon2
, R.MaxNum - COUNT(CASE WHEN C.StartDate <= '2018-12-09' AND C.TueHrs > 0 THEN 1 END) AS Tue2
FROM Rooms R
LEFT JOIN Children C ON R.id = C.RoomID
GROUP BY R.id, R.RoomName, R.MaxNum;
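To try the single-join version against the question's sample data, something like the following sketch could be used. It assumes SQLite (via Python's standard sqlite3 module) in place of the real engine, and the ORDER BY is an addition so the rows come back in a predictable order.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Rooms (id INT, RoomName VARCHAR(10), MaxNum INT);
CREATE TABLE Children (id INT, RoomID INT, MonHrs INT, TueHrs INT, StartDate DATE);
INSERT INTO Rooms VALUES (1, 'Blue', 5), (2, 'Green', 10), (3, 'Red', 15);
INSERT INTO Children VALUES
    (1, 1, 5, 0, '2018-12-02'), (2, 1, 0, 5, '2018-12-02'),
    (3, 1, 5, 5, '2018-12-09'), (4, 1, 0, 5, '2018-12-09'),
    (5, 2, 5, 0, '2018-12-09'), (6, 2, 0, 5, '2018-12-09');
""")

# One LEFT JOIN only: each child appears once per room, and the week cut-off
# is applied inside each COUNT instead of in the JOIN condition.
rows = conn.execute("""
SELECT R.id, R.RoomName, R.MaxNum
     , R.MaxNum - COUNT(CASE WHEN C.StartDate <= '2018-12-02' AND C.MonHrs > 0 THEN 1 END) AS Mon1
     , R.MaxNum - COUNT(CASE WHEN C.StartDate <= '2018-12-02' AND C.TueHrs > 0 THEN 1 END) AS Tue1
     , R.MaxNum - COUNT(CASE WHEN C.StartDate <= '2018-12-09' AND C.MonHrs > 0 THEN 1 END) AS Mon2
     , R.MaxNum - COUNT(CASE WHEN C.StartDate <= '2018-12-09' AND C.TueHrs > 0 THEN 1 END) AS Tue2
FROM Rooms R
LEFT JOIN Children C ON R.id = C.RoomID
GROUP BY R.id, R.RoomName, R.MaxNum
ORDER BY R.id
""").fetchall()

for row in rows:
    print(row)
# (1, 'Blue', 5, 4, 4, 3, 2)
# (2, 'Green', 10, 10, 10, 9, 9)
# (3, 'Red', 15, 15, 15, 15, 15)
```

Because COUNT ignores NULLs, a room with no children (Red) still yields a count of 0 here, so no COALESCE is needed in this form.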
Upvotes: 1