teocka
teocka

Reputation: 137

Query Performance Improvement

Here two tables categorizing the same items (TBL1.ID and TBL2.Code) in different ways (Grp and Cat)

Each Item only belongs to one group (with different group names in two tables) Items in TBL2 are subset of items in TBL1

Missing Items of only existing Cat's in TBL2 with proper Cat needed.(A4 not required)

TBL1:

    Grp         ID
--------------------
    X1          A1        
    X1          B1        
    X1          C1        
    X1          D1        
    X2          A2        
    X2          B2        
    X2          C2        
    X2          D2        
    X3          A3        
    X3          B3        
    X4          A4        

TBL2:

   Cat         Code
--------------------
    1           A1        
    1           B1        
    1           C1        
    2           A2        
    2           B2        
    3           A3        
    5           A5        

Desired:

    ID         Grp         Cat        
---------------------------------
    D1          X1          1         
    C2          X2          2         
    D2          X2          2         
    B3          X3          3         

This Query works fine but for large number of records is too slow:

SELECT
  TBL1.ID
  ,TBL1.Grp
  ,(SELECT DISTINCT T2.Cat FROM TBL2 T2 WHERE T2.Code IN
  (SELECT T1.ID FROM TBL1 T1 WHERE T1.Grp = TBL1.Grp )) AS Cat
 
FROM TBL1
LEFT JOIN TBL2
ON TBL1.ID = TBL2.Code
WHERE TBL2.Code IS NULL
AND (SELECT DISTINCT T2.Cat FROM TBL2 T2 WHERE T2.Code IN
    (SELECT T1.ID FROM TBL1 T1 WHERE T1.Grp = TBL1.Grp ))  IS NOT NULL

Any solution with better performance?

Upvotes: 0

Views: 42

Answers (1)

IVNSTN
IVNSTN

Reputation: 9325

Original subquery has a major flaw: it is designed to be able to return more than one value but this will break the entire query. Change your data so that any group is connected to more than one category and it will fail (e.g. (1, 'A1') -> (2, 'A1')). DISTINCT does help you to avoid failure with given sample data but it cannot help against different values.

And you did not specify which category to choose if there are more then one matching a group. I used MAX:

SELECT
  t.ID
  ,t.Grp
  ,c.Cat
FROM TBL1 t
INNER JOIN (
  SELECT g.Grp, MAX(c.Cat) Cat 
  FROM TBL1 g
  INNER JOIN TBL2 c
  ON c.Code = g.ID
  GROUP BY g.Grp
) c
ON c.Grp = t.Grp
WHERE NOT EXISTS(SELECT 1 FROM TBL2 t2 WHERE t2.Code = t.ID)

here TBL1 is joined to list of all groups connected to any category. INNER JOIN will keep only groups that are connected to at least one category (equivalent to your NOT NULL). And I replaced your LEFT JOIN to NOT EXISTS because it is more lightweight operation for SQL SERVER whilst results he same.

Upvotes: 2

Related Questions