KAHartle

Reputation: 135

LEFT JOIN on the same table doubling rows

I have two tables that I am trying to LEFT JOIN, but I am not getting the expected results.

Rooms have multiple Children on different days. However, a Child is only counted in a Room once they have started (StartDate) and only on days for which they have hours allocated. The output I am trying to achieve is this, where each day column is MaxNum minus the number of Children counted on that day:

Room  | MaxNum | Mon(Week1) | Tue(Week1) | Mon(Week2) | Tue(Week2)
Blue  | 5      | 4          | 4          | 3          | 2
Green | 10     | 10         | 10         | 9          | 9  
Red   | 15     | 15         | 15         | 15         | 15 

Here is the schema and some data...

create table Rooms(
  id       INT,
  RoomName VARCHAR(10),
  MaxNum   INT
);

create table Children (
  id          INT,
  RoomID      INT,
  MonHrs      INT,
  TueHrs      INT,
  StartDate   DATE
);

INSERT INTO Rooms VALUES (1, 'Blue', 5);
INSERT INTO Rooms VALUES (2, 'Green', 10);
INSERT INTO Rooms VALUES (3, 'Red', 15);

INSERT INTO Children VALUES (1, 1, 5, 0, '2018-12-02');
INSERT INTO Children VALUES (2, 1, 0, 5, '2018-12-02');
INSERT INTO Children VALUES (3, 1, 5, 5, '2018-12-09');
INSERT INTO Children VALUES (4, 1, 0, 5, '2018-12-09');
INSERT INTO Children VALUES (5, 2, 5, 0, '2018-12-09');
INSERT INTO Children VALUES (6, 2, 0, 5, '2018-12-09');

The SQL I am having trouble with is this. It may not be the correct approach.

SELECT R.RoomName, R.MaxNum,
       R.MaxNum - SUM(CASE WHEN C1.MonHrs > 0 THEN 1 ELSE 0 END) AS Mon1,
       R.MaxNum - SUM(CASE WHEN C1.TueHrs > 0 THEN 1 ELSE 0 END) AS Tue1,
       R.MaxNum - SUM(CASE WHEN C2.MonHrs > 0 THEN 1 ELSE 0 END) AS Mon2,
       R.MaxNum - SUM(CASE WHEN C2.TueHrs > 0 THEN 1 ELSE 0 END) AS Tue2
  FROM Rooms R
        LEFT JOIN Children C1 
          ON R.id = C1.RoomID
         AND C1.StartDate <= '2018-12-02'     
        LEFT JOIN Children C2
          ON R.id = C2.RoomID
         AND C2.StartDate <= '2018-12-09'
 GROUP BY R.RoomName;         

The rows are being doubled up by the two LEFT JOINs, which throws the counts way off, and I don't know how to prevent it. You can see the effect if you replace the SELECT list with *:

SELECT *
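
To make the doubling concrete, here is a minimal sketch that counts the joined rows per room. It is an illustration only: it loads the sample data above into an in-memory SQLite database through Python's sqlite3 module as a stand-in for MySQL, stores StartDate as text, and the aliases joined_rows and mon1_sum are made up for this example.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL. StartDate is stored as
# ISO-8601 text, which compares correctly as a plain string.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Rooms (id INT, RoomName VARCHAR(10), MaxNum INT);
    CREATE TABLE Children (id INT, RoomID INT, MonHrs INT, TueHrs INT,
                           StartDate TEXT);
    INSERT INTO Rooms VALUES (1, 'Blue', 5), (2, 'Green', 10), (3, 'Red', 15);
    INSERT INTO Children VALUES
        (1, 1, 5, 0, '2018-12-02'), (2, 1, 0, 5, '2018-12-02'),
        (3, 1, 5, 5, '2018-12-09'), (4, 1, 0, 5, '2018-12-09'),
        (5, 2, 5, 0, '2018-12-09'), (6, 2, 0, 5, '2018-12-09');
""")

# Same two LEFT JOINs as in the question, but counting the joined rows and
# showing the raw week-1 Monday sum before it is subtracted from MaxNum.
rows = conn.execute("""
    SELECT R.RoomName,
           COUNT(*) AS joined_rows,
           SUM(CASE WHEN C1.MonHrs > 0 THEN 1 ELSE 0 END) AS mon1_sum
      FROM Rooms R
      LEFT JOIN Children C1
        ON R.id = C1.RoomID AND C1.StartDate <= '2018-12-02'
      LEFT JOIN Children C2
        ON R.id = C2.RoomID AND C2.StartDate <= '2018-12-09'
     GROUP BY R.id, R.RoomName
     ORDER BY R.id
""").fetchall()

for row in rows:
    print(row)
```

With the sample data, Blue has 2 matching C1 rows and 4 matching C2 rows, so the join yields 2 x 4 = 8 rows for that room, and the single week-1 child with Monday hours is counted 4 times instead of once.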

Any suggestions would help a lot.

Upvotes: 4

Views: 1193

Answers (2)

Tim Biegeleisen

Reputation: 522636

This sort of problem usually comes from aggregating at too broad a point in the query, after the joins have already multiplied the rows, so records are counted more than once. Try aggregating the Children table in a separate subquery:

SELECT
    R.RoomName,
    R.MaxNum,
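    -- COALESCE: a room with no children gets NULL counts from the LEFT JOIN
    R.MaxNum - COALESCE(C.Mon1, 0) AS Mon1,
    R.MaxNum - COALESCE(C.Tue1, 0) AS Tue1,
    R.MaxNum - COALESCE(C.Mon2, 0) AS Mon2,
    R.MaxNum - COALESCE(C.Tue2, 0) AS Tue2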
FROM Rooms R
LEFT JOIN
(
    SELECT
        RoomID,
        COUNT(CASE WHEN MonHrs > 0 AND StartDate <= '2018-12-02'
                   THEN 1 END) AS Mon1,
        COUNT(CASE WHEN TueHrs > 0 AND StartDate <= '2018-12-02'
                   THEN 1 END) AS Tue1,
        COUNT(CASE WHEN MonHrs > 0 AND StartDate <= '2018-12-09'
                   THEN 1 END) AS Mon2,
        COUNT(CASE WHEN TueHrs > 0 AND StartDate <= '2018-12-09'
                   THEN 1 END) AS Tue2
    FROM Children
    GROUP BY RoomID
) C
    ON R.id = C.RoomID;

Note that we can avoid the double left join in your original query by instead using conditional aggregation on the start date.
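
As a sanity check, the sketch below runs the pre-aggregated query against the question's sample data. It is only a sketch under assumptions: an in-memory SQLite database (via Python's sqlite3) stands in for MySQL, and StartDate is stored as text. The subtracted counts are wrapped in COALESCE so that a room with no children at all (Red) shows its full MaxNum rather than NULL.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; StartDate is ISO text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Rooms (id INT, RoomName VARCHAR(10), MaxNum INT);
    CREATE TABLE Children (id INT, RoomID INT, MonHrs INT, TueHrs INT,
                           StartDate TEXT);
    INSERT INTO Rooms VALUES (1, 'Blue', 5), (2, 'Green', 10), (3, 'Red', 15);
    INSERT INTO Children VALUES
        (1, 1, 5, 0, '2018-12-02'), (2, 1, 0, 5, '2018-12-02'),
        (3, 1, 5, 5, '2018-12-09'), (4, 1, 0, 5, '2018-12-09'),
        (5, 2, 5, 0, '2018-12-09'), (6, 2, 0, 5, '2018-12-09');
""")

# Children is collapsed to one row per room before the join, so the join
# cannot multiply rows. COALESCE handles rooms absent from the subquery.
result = conn.execute("""
    SELECT R.RoomName, R.MaxNum,
           R.MaxNum - COALESCE(C.Mon1, 0) AS Mon1,
           R.MaxNum - COALESCE(C.Tue1, 0) AS Tue1,
           R.MaxNum - COALESCE(C.Mon2, 0) AS Mon2,
           R.MaxNum - COALESCE(C.Tue2, 0) AS Tue2
      FROM Rooms R
      LEFT JOIN (
            SELECT RoomID,
                   COUNT(CASE WHEN MonHrs > 0 AND StartDate <= '2018-12-02' THEN 1 END) AS Mon1,
                   COUNT(CASE WHEN TueHrs > 0 AND StartDate <= '2018-12-02' THEN 1 END) AS Tue1,
                   COUNT(CASE WHEN MonHrs > 0 AND StartDate <= '2018-12-09' THEN 1 END) AS Mon2,
                   COUNT(CASE WHEN TueHrs > 0 AND StartDate <= '2018-12-09' THEN 1 END) AS Tue2
              FROM Children
             GROUP BY RoomID
           ) C ON R.id = C.RoomID
     ORDER BY R.id
""").fetchall()

for row in result:
    print(row)
```

On the sample data this reproduces the table from the question: Blue 5/4/4/3/2, Green 10/10/10/9/9, Red 15/15/15/15/15.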

Late edit: You probably don't even need a subquery at all, q.v. the answer by @Salman. But either of our answers should resolve the double counting problem.

Upvotes: 2

Salman Arshad

Reputation: 272386

You need to use a single LEFT JOIN and move the date filter from the JOIN condition into the aggregate:

SELECT R.id, R.RoomName, R.MaxNum
    , R.MaxNum - COUNT(CASE WHEN C.StartDate <= '2018-12-02' AND C.MonHrs > 0 THEN 1 END) AS Mon1
    , R.MaxNum - COUNT(CASE WHEN C.StartDate <= '2018-12-02' AND C.TueHrs > 0 THEN 1 END) AS Tue1
    , R.MaxNum - COUNT(CASE WHEN C.StartDate <= '2018-12-09' AND C.MonHrs > 0 THEN 1 END) AS Mon2
    , R.MaxNum - COUNT(CASE WHEN C.StartDate <= '2018-12-09' AND C.TueHrs > 0 THEN 1 END) AS Tue2
FROM Rooms R
LEFT JOIN Children C ON R.id = C.RoomID
GROUP BY R.id, R.RoomName, R.MaxNum;
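
One detail worth seeing in isolation is why COUNT works here for a room with no children: COUNT skips NULLs and returns 0, whereas SUM over a CASE with no ELSE branch returns NULL, which would turn MaxNum minus the aggregate into NULL. The sketch below compares the two on the question's data. It assumes an in-memory SQLite database (via Python's sqlite3) as a stand-in for MySQL, and the aliases cnt and total are invented for the example.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; StartDate is ISO text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Rooms (id INT, RoomName VARCHAR(10), MaxNum INT);
    CREATE TABLE Children (id INT, RoomID INT, MonHrs INT, TueHrs INT,
                           StartDate TEXT);
    INSERT INTO Rooms VALUES (1, 'Blue', 5), (2, 'Green', 10), (3, 'Red', 15);
    INSERT INTO Children VALUES
        (1, 1, 5, 0, '2018-12-02'), (2, 1, 0, 5, '2018-12-02'),
        (3, 1, 5, 5, '2018-12-09'), (4, 1, 0, 5, '2018-12-09'),
        (5, 2, 5, 0, '2018-12-09'), (6, 2, 0, 5, '2018-12-09');
""")

# Red has no children, so its only joined row has NULLs in every C column
# and the CASE expression evaluates to NULL for it.
compare = conn.execute("""
    SELECT R.RoomName,
           COUNT(CASE WHEN C.MonHrs > 0 THEN 1 END) AS cnt,
           SUM(CASE WHEN C.MonHrs > 0 THEN 1 END)   AS total
      FROM Rooms R
      LEFT JOIN Children C ON R.id = C.RoomID
     GROUP BY R.id, R.RoomName
     ORDER BY R.id
""").fetchall()

for row in compare:
    print(row)
```

For Red, cnt comes back as 0 while total is NULL (None in Python), so subtracting the COUNT from MaxNum still gives 15.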

Upvotes: 1
