Reputation: 113
My company recently change database engine from paradox to ms sql 2008. It is way more faster, but we have problem with one query with low performance.
Query searches if fields NUMER from DM201211 table not exists in any of DO_WZ1, DO_WZ2 ... DO_WZ6 fields in DOK_SP table.
SELECT
Z.STATUS,COUNT(K.NUMER) AS DOC_COUNT, MIN(K.ZDNIA) AS OLDEST
FROM DM201311 K LEFT OUTER JOIN ZLEC Z ON K.DO_ZAMOWIENIA=Z.NR
WHERE NUMER NOT IN (SELECT DO_WZ1 FROM DOK_SP WHERE DO_WZ1 IS NOT NULL AND ID_DOK<>'-1') AND NUMER NOT IN (SELECT DO_WZ2 FROM DOK_SP WHERE DO_WZ2 IS NOT NULL AND ID_DOK<>'-1') AND NUMER NOT IN (SELECT DO_WZ3 FROM DOK_SP WHERE DO_WZ3 IS NOT NULL AND ID_DOK<>'-1') AND NUMER NOT IN (SELECT DO_WZ4 FROM DOK_SP WHERE DO_WZ4 IS NOT NULL AND ID_DOK<>'-1') AND NUMER NOT IN (SELECT DO_WZ5 FROM DOK_SP WHERE DO_WZ5 IS NOT NULL AND ID_DOK<>'-1') AND NUMER NOT IN (SELECT DO_WZ6 FROM DOK_SP WHERE DO_WZ6 IS NOT NULL AND ID_DOK<>'-1')
GROUP BY Z.STATUS
That worked pretty well in paradox (few secs of execution). Now in MS SQL 2008 it executes about 17 seconds. After my short investigation I've find out that multiple subqueries cause this situation. Is there any chnce to optimize it with removing subqueries?
As you see all subqueries are done for the same table. I've tried with EXIST clause but I've only get down from 17 secs to 14 secs. I believe that there is better solution.
NOTE: T-SQL is not allowed for me to approach this!
Upvotes: 0
Views: 2290
Reputation: 4350
Paradox!? Oh my... are you still using Delphi 5 also? Anyway I guess if Paradox can handle it its not a large DB. Did you created indexes for that tables? A good indexing can solve performance issues. Just in case remember: with MS SQL creating FKs ill NOT create the indexes.
Upvotes: 0
Reputation: 1316
Try to rewrite the NOT IN parts of your query with NOT EXISTS like this:
WHERE NUMER NOT IN (SELECT DO_WZ1 FROM DOK_SP
WHERE DO_WZ1 IS NOT NULL AND ID_DOK<>'-1')
becomes
NOT EXISTS(SELECT DO_WZ1 FROM DOK_SP
WHERE DO_WZ1 = NUMER
AND ID_DOK<>'-1')
see here .
And if there is still performance problems, check the query plan and add missing index if needed If your DOC_SP is large, some indexes on (DO_ZW1, ID_OK), (DO_ZW2, ID_OK) etc could help (or one with all to make a seek+scan, depends of the data)
Another solution: add LEFT OUTER JOINS for each NOT IN and test nulls (ie not found). This is a common pattern
LEFT OUTER JOIN (SELECT DO_WZ1 FROM DOK_SP WHERE ID_DOK<>'-1') W1 ON NUMER=DO_WZ1
LEFT OUTER JOIN (SELECT DO_WZ2 FROM DOK_SP WHERE ID_DOK<>'-1') W2 ON NUMER=DO_WZ2
...
WHERE
W1.DO_WZ1 IS NULL
AND W2.DO_WZ2 IS NULL
..
Or, to avoid multiple joins, only one with OR in JOIN (will probably lead to a scan, an index with included columns will help)
SELECT
Z.STATUS,COUNT(K.NUMER) AS DOC_COUNT, MIN(K.ZDNIA) AS OLDEST
FROM DM201311 K
LEFT OUTER JOIN ZLEC Z ON K.DO_ZAMOWIENIA=Z.NR
LEFT OUTER JOIN ( SELECT 1 as X, DO_WZ1, DO_WZ2 DO_WZ3, DO_WZ4, DO_WZ5, DO_WZ6
FROM DOK_SP
WHERE ID_DOK<>'-1') WZ
ON NUMER = WZ.DO_WZ1
OR NUMER = WZ.DO_WZ2
OR NUMER = WZ.DO_WZ3
OR NUMER = WZ.DO_WZ4
OR NUMER = WZ.DO_WZ5
OR NUMER = WZ.DO_WZ6
WHERE
WZ.X IS NULL
GROUP BY Z.STATUS
Upvotes: 1
Reputation: 20804
@Alain_Deloin suggested that you replace "not in" with "not exists". The reason is that "not in" is slow. However, "not in" and "not exists" do not always represent the same logic. That being the case, here is a very unintuitive yet effective way to replace "not in".
Do not do this, it's too slow.
where somefield not in
(select somefield
from sometable
where whatever
)
Do this instead.
where somefield in
(select somefield
from sometable
except
select somefield
from sometable
where whatever
)
Upvotes: 0
Reputation: 505
SELECT Z.STATUS,COUNT(K.NUMER) AS DOC_COUNT, MIN(K.ZDNIA) AS OLDEST
FROM DM201311 K LEFT OUTER JOIN ZLEC Z ON K.DO_ZAMOWIENIA=Z.NR
WHERE NUMER NOT IN (SELECT COALESCE(DO_WZ1, DO_WZ2, DO_WZ3, DO_WZ4, DO_WZ5, DO_WZ6)
FROM DOK_SP WHERE COALESCE(DO_WZ1, DO_WZ2, DO_WZ3, DO_WZ4, DO_WZ5, DO_WZ6) IS NOT NULL
AND ID_DOK<>'-1')
GROUP BY Z.STATUS
Upvotes: 2
Reputation: 516
SELECT
Z.STATUS,COUNT(K.NUMER) AS DOC_COUNT, MIN(K.ZDNIA) AS OLDEST
FROM DM201311 K
LEFT OUTER JOIN ZLEC Z
ON K.DO_ZAMOWIENIA=Z.NR
WHERE NUMER NOT IN (
SELECT DO_WZ1 FROM DOK_SP WHERE DO_WZ1 IS NOT NULL AND ID_DOK<>'-1'
UNION
SELECT DO_WZ2 FROM DOK_SP WHERE DO_WZ2 IS NOT NULL AND ID_DOK<>'-1'
UNION
SELECT DO_WZ3 FROM DOK_SP WHERE DO_WZ3 IS NOT NULL AND ID_DOK<>'-1'
UNION
SELECT DO_WZ4 FROM DOK_SP WHERE DO_WZ4 IS NOT NULL AND ID_DOK<>'-1'
UNION
SELECT DO_WZ5 FROM DOK_SP WHERE DO_WZ5 IS NOT NULL AND ID_DOK<>'-1'
UNION
SELECT DO_WZ6 FROM DOK_SP WHERE DO_WZ6 IS NOT NULL AND ID_DOK<>'-1')
GROUP BY Z.STATUS
Upvotes: 0
Reputation: 1382
Try to update your query with this version
SELECT Z.STATUS ,
COUNT(K.NUMER) AS DOC_COUNT ,
MIN(K.ZDNIA) AS OLDEST
FROM DM201311 K
LEFT OUTER JOIN ZLEC Z ON K.DO_ZAMOWIENIA = Z.NR
WHERE NUMER NOT IN ( SELECT DO_WZ1
FROM DOK_SP
WHERE (DO_WZ1 IS NOT NULL AND DO_WZ2 IS NOT NULL AND DO_WZ3 IS NOT NULL AND DO_WZ4 IS NOT NULL AND DO_WZ5 IS NOT NULL AND DO_WZ6 IS NOT NULL)
AND ID_DOK <> '-1' )
GROUP BY Z.STATUS
If ID_DOK is a number you can change the <> '-1' to >= 0 Try to avoid negations, this will hit your performance.
Upvotes: 1