Kilren
Kilren

Reputation: 415

aggregation on Multiple LEFT JOIN performance issue (SQL SERVER)

I have a performance issue regarding a quite simple query that run for more than 30 minutes:

SELECT   P.pID
    ,COUNT(T1.ID) AS NB1
    ,COUNT(T2.ID) AS NB2
    ,COUNT(T3.ID) AS NB3
    ,COUNT(T4.ID) AS NB4
    ,COUNT(T5.ID) AS NB5

FROM MainTable P 

LEFT OUTER JOIN Table1 T1 ON  P.pID = T1.pID
LEFT OUTER JOIN Table2 T2 ON  P.pID = T2.pID
LEFT OUTER JOIN Table3 T3 ON  P.pID = T3.pID
LEFT OUTER JOIN Table4 T4 ON  P.pID = T4.pID
LEFT OUTER JOIN Table5 T5 ON  P.pID = T5.pID

GROUP BY P.pID

Where as each query would reply in few ms:

ex.

  SELECT   P.pID
        ,COUNT(T1.ID) AS NB1

    FROM MainTable P 
    LEFT OUTER JOIN Table1 T1 ON  P.pID = T1.pID

    GROUP BY P.pID

If I don't use any aggregation (COUNT or anything else) the query run in few ms: ex. SELECT P.pID

FROM MainTable P 

LEFT OUTER JOIN Table1 T1 ON  P.pID = T1.pID
LEFT OUTER JOIN Table2 T2 ON  P.pID = T2.pID
LEFT OUTER JOIN Table3 T3 ON  P.pID = T3.pID
LEFT OUTER JOIN Table4 T4 ON  P.pID = T4.pID
LEFT OUTER JOIN Table5 T5 ON  P.pID = T5.pID

GROUP BY P.pID

Obviously all indexes are set etc... The only "slowing down" element is that pID is a varchar (50) but I can't change it and in my opinion this is not be the main problem here.

I used a workaround including union all that work fine but I really wondering why these is so long and how could I optimize this as aggregating over multiple left join are really common stuff in reporting project and should not be so slow.

thank you for your help.

[EDIT] thx to ARION i got a nice query working really nice.

But my main concern is about understanding what's wrong in the sql engine writing the query with multiple left join.

Tables descr would be :

Table P (500 rows) 
pID varchar(50) NOT NULL as primary key
p.* doesn't matter 

Table Tn (between 2000 and 8000 rows)
Tn.ID int NOT NULL as primary key 
pID varchar(50) NOT NULL as Foreign key

[EDIT] Thanks to Erland Sommarskog on social.msdn.microsoft.com that point me my error of analyse. -- detail about the answer

Keep in mind :
LEFT JOIN forms cartesian product

I was wrong assuming that the Cartesian product might have been filtered as I always refere to the same table.

thanks

Upvotes: 2

Views: 2760

Answers (2)

ypercubeᵀᴹ
ypercubeᵀᴹ

Reputation: 115660

You could also rewrite the query by first grouping by (in subqueries), then joining:

SELECT
    P.pID,
    T1.NB1,
    T2.NB2,
    T3.NB3,
    T4.NB4,
    T5.NB5
FROM MainTable P 
  LEFT JOIN
    (SELECT pID, COUNT(*) AS NB1 FROM Table1 GROUP BY pID) AS T1
    ON T1.pID = P.pID
  LEFT JOIN
    (SELECT pID, COUNT(*) AS NB2 FROM Table2 GROUP BY pID) AS T2
    ON T2.pID = P.pID
  LEFT JOIN
    (SELECT pID, COUNT(*) AS NB3 FROM Table3 GROUP BY pID) AS T3
    ON T3.pID = P.pID
  LEFT JOIN
    (SELECT pID, COUNT(*) AS NB4 FROM Table4 GROUP BY pID) AS T4
    ON T4.pID = P.pID
  LEFT JOIN
    (SELECT pID, COUNT(*) AS NB5 FROM Table5 GROUP BY pID) AS T5
    ON T5.pID = P.pID

This would be useful if you want to include in the results other aggregates, besides the COUNT(*), without having to run more correlated subqueries.

Upvotes: 2

Arion
Arion

Reputation: 31249

Maybe something like this:

SELECT
    P.pID,
    (SELECT COUNT(*) FROM Table1 T1 WHERE P.pID = T1.pID) AS NB1,
    (SELECT COUNT(*) FROM Table2 T2 WHERE P.pID = T2.pID) AS NB2,
    (SELECT COUNT(*) FROM Table3 T3 WHERE P.pID = T3.pID) AS NB3,
    (SELECT COUNT(*) FROM Table4 T4 WHERE P.pID = T4.pID) AS NB4,
    (SELECT COUNT(*) FROM Table5 T5 WHERE P.pID = T5.pID) AS NB5

FROM MainTable P 

Upvotes: 4

Related Questions