yarpi87
yarpi87

Reputation: 113

Multiple subqueries to same table - performance

My company recently change database engine from paradox to ms sql 2008. It is way more faster, but we have problem with one query with low performance.

Query searches if fields NUMER from DM201211 table not exists in any of DO_WZ1, DO_WZ2 ... DO_WZ6 fields in DOK_SP table.

SELECT
Z.STATUS,COUNT(K.NUMER) AS DOC_COUNT, MIN(K.ZDNIA) AS OLDEST
FROM DM201311 K LEFT OUTER JOIN ZLEC Z ON K.DO_ZAMOWIENIA=Z.NR   
WHERE NUMER NOT IN (SELECT DO_WZ1 FROM DOK_SP WHERE DO_WZ1 IS NOT NULL AND ID_DOK<>'-1') AND NUMER NOT IN (SELECT DO_WZ2 FROM DOK_SP WHERE DO_WZ2 IS NOT NULL AND ID_DOK<>'-1') AND NUMER NOT IN (SELECT DO_WZ3 FROM DOK_SP WHERE DO_WZ3 IS NOT NULL AND ID_DOK<>'-1') AND NUMER NOT IN (SELECT DO_WZ4 FROM DOK_SP WHERE DO_WZ4 IS NOT NULL AND ID_DOK<>'-1') AND NUMER NOT IN (SELECT DO_WZ5 FROM DOK_SP WHERE DO_WZ5 IS NOT NULL AND ID_DOK<>'-1') AND NUMER NOT IN (SELECT DO_WZ6 FROM DOK_SP WHERE DO_WZ6 IS NOT NULL AND ID_DOK<>'-1') 
GROUP BY Z.STATUS 

That worked pretty well in paradox (few secs of execution). Now in MS SQL 2008 it executes about 17 seconds. After my short investigation I've find out that multiple subqueries cause this situation. Is there any chnce to optimize it with removing subqueries?

As you see all subqueries are done for the same table. I've tried with EXIST clause but I've only get down from 17 secs to 14 secs. I believe that there is better solution.

NOTE: T-SQL is not allowed for me to approach this!

Upvotes: 0

Views: 2290

Answers (6)

jean
jean

Reputation: 4350

Paradox!? Oh my... are you still using Delphi 5 also? Anyway I guess if Paradox can handle it its not a large DB. Did you created indexes for that tables? A good indexing can solve performance issues. Just in case remember: with MS SQL creating FKs ill NOT create the indexes.

Upvotes: 0

ARA
ARA

Reputation: 1316

Try to rewrite the NOT IN parts of your query with NOT EXISTS like this:

WHERE   NUMER NOT IN (SELECT DO_WZ1 FROM DOK_SP 
                    WHERE DO_WZ1 IS NOT NULL AND ID_DOK<>'-1') 

becomes

NOT EXISTS(SELECT DO_WZ1 FROM DOK_SP 
    WHERE DO_WZ1 = NUMER
        AND ID_DOK<>'-1') 

see here .

And if there is still performance problems, check the query plan and add missing index if needed If your DOC_SP is large, some indexes on (DO_ZW1, ID_OK), (DO_ZW2, ID_OK) etc could help (or one with all to make a seek+scan, depends of the data)

Another solution: add LEFT OUTER JOINS for each NOT IN and test nulls (ie not found). This is a common pattern

LEFT OUTER JOIN (SELECT DO_WZ1 FROM DOK_SP WHERE ID_DOK<>'-1') W1 ON NUMER=DO_WZ1
LEFT OUTER JOIN (SELECT DO_WZ2 FROM DOK_SP WHERE ID_DOK<>'-1') W2 ON NUMER=DO_WZ2
...
WHERE 
  W1.DO_WZ1 IS NULL
AND W2.DO_WZ2 IS NULL
.. 

Or, to avoid multiple joins, only one with OR in JOIN (will probably lead to a scan, an index with included columns will help)

SELECT
    Z.STATUS,COUNT(K.NUMER) AS DOC_COUNT, MIN(K.ZDNIA) AS OLDEST
FROM DM201311 K 
LEFT OUTER JOIN ZLEC Z ON K.DO_ZAMOWIENIA=Z.NR   
LEFT OUTER JOIN ( SELECT 1 as X, DO_WZ1, DO_WZ2 DO_WZ3, DO_WZ4, DO_WZ5, DO_WZ6  
                    FROM DOK_SP 
                    WHERE ID_DOK<>'-1') WZ 
                       ON NUMER = WZ.DO_WZ1
                        OR NUMER = WZ.DO_WZ2
                        OR NUMER = WZ.DO_WZ3
                        OR NUMER = WZ.DO_WZ4
                        OR NUMER = WZ.DO_WZ5
                        OR NUMER = WZ.DO_WZ6
WHERE 
    WZ.X IS NULL
GROUP BY Z.STATUS 

Upvotes: 1

Dan Bracuk
Dan Bracuk

Reputation: 20804

@Alain_Deloin suggested that you replace "not in" with "not exists". The reason is that "not in" is slow. However, "not in" and "not exists" do not always represent the same logic. That being the case, here is a very unintuitive yet effective way to replace "not in".

Do not do this, it's too slow.

where somefield not in
(select somefield
 from sometable
 where whatever
 )

Do this instead.

where somefield in 
(select somefield
from sometable
except
select somefield
 from sometable
 where whatever
 )

Upvotes: 0

Saad Surya
Saad Surya

Reputation: 505

SELECT Z.STATUS,COUNT(K.NUMER) AS DOC_COUNT, MIN(K.ZDNIA) AS OLDEST
FROM DM201311 K LEFT OUTER JOIN ZLEC Z ON K.DO_ZAMOWIENIA=Z.NR   
WHERE NUMER NOT IN (SELECT COALESCE(DO_WZ1, DO_WZ2, DO_WZ3, DO_WZ4, DO_WZ5, DO_WZ6)
                    FROM DOK_SP WHERE COALESCE(DO_WZ1, DO_WZ2, DO_WZ3, DO_WZ4, DO_WZ5, DO_WZ6) IS NOT NULL
                    AND ID_DOK<>'-1')
GROUP BY Z.STATUS

Upvotes: 2

Gavin
Gavin

Reputation: 516

SELECT
Z.STATUS,COUNT(K.NUMER) AS DOC_COUNT, MIN(K.ZDNIA) AS OLDEST
FROM DM201311 K 
LEFT OUTER JOIN ZLEC Z 
ON K.DO_ZAMOWIENIA=Z.NR   
WHERE NUMER NOT IN (
SELECT DO_WZ1 FROM DOK_SP WHERE DO_WZ1 IS NOT NULL AND ID_DOK<>'-1' 
UNION
SELECT DO_WZ2 FROM DOK_SP WHERE DO_WZ2 IS NOT NULL AND ID_DOK<>'-1' 
UNION 
SELECT DO_WZ3 FROM DOK_SP WHERE DO_WZ3 IS NOT NULL AND ID_DOK<>'-1' 
UNION
SELECT DO_WZ4 FROM DOK_SP WHERE DO_WZ4 IS NOT NULL AND ID_DOK<>'-1' 
UNION
SELECT DO_WZ5 FROM DOK_SP WHERE DO_WZ5 IS NOT NULL AND ID_DOK<>'-1' 
UNION 
SELECT DO_WZ6 FROM DOK_SP WHERE DO_WZ6 IS NOT NULL AND ID_DOK<>'-1') 
GROUP BY Z.STATUS 

Upvotes: 0

Juan
Juan

Reputation: 1382

Try to update your query with this version

SELECT  Z.STATUS ,
        COUNT(K.NUMER) AS DOC_COUNT ,
        MIN(K.ZDNIA) AS OLDEST
FROM    DM201311 K
        LEFT OUTER JOIN ZLEC Z ON K.DO_ZAMOWIENIA = Z.NR
WHERE   NUMER NOT IN ( SELECT   DO_WZ1
                       FROM     DOK_SP
                       WHERE    (DO_WZ1 IS NOT NULL AND DO_WZ2 IS NOT NULL AND DO_WZ3 IS NOT NULL AND DO_WZ4 IS NOT NULL AND DO_WZ5 IS NOT NULL AND DO_WZ6 IS NOT NULL)
                                AND ID_DOK <> '-1' )
GROUP BY Z.STATUS 

If ID_DOK is a number you can change the <> '-1' to >= 0 Try to avoid negations, this will hit your performance.

Upvotes: 1

Related Questions