Reputation: 179
I have performence issues with the following query :
SELECT A,B,C,D,E,F
FROM TABLE1 T1
INNER JOIN TABLE2 T2
ON (((T1.E IS NULL OR T2.E IS NULL) AND T1.F= T2.F)
OR((T1.E IS NOT NULL OR T2.E IS NOT NULL) AND T1.E = T2.E))
More than 30 min to return about 1000 rows
I've tried this :
SELECT A,B,C,D,E,F
FROM TABLE1 T1
INNER JOIN TABLE2 T2
ON (((COALESCE(T1.E,-1) = COALESCE(T2.E,-1)
AND ((T1.F= T2.F)
OR(T1.E = T2.E)))))
but gives less results than the first one
Can you help me to find another way to write it in oreder to reduce execution time please ?
I'm using SQL Server 2016
Upvotes: 1
Views: 72
Reputation: 415690
Try this:
SELECT A,B,C,D,E,F
FROM TABLE1 T1
INNER JOIN TABLE2 T2 ON T1.F = T2.F
WHERE T1.E IS NULL OR T2.E IS NULL
UNION
SELECT A,B,C,D,E,F
FROM TABLE1 T1
INNER JOIN TABLE2 T2 ON T1.E = T2.E
WHERE COALESCE(T1.E, T2.E) IS NOT NULL
You might want a UNION ALL
, but this should match the original.
This also exposes an interesting quirk in the original logic you may want to reconsider. If the E
field from one table is NULL
, but not the other, the original code would make checks on both the E
and F
fields. Which is interesting, because for the E
field we know one side is null, but the other is not, so that case can't ever be true... but the logic says to still make the comparison.
It's hard to know what you're doing with the generic names, but there's definitely room to clean up that conditional check. Before worrying about matching to your first results, go back and make sure those first results clearly and accurately state what you want to accomplish, even if that means making the query even slower or longer.
Then, only when you are sure you have a query that both produces accurate results and describes them in an understandable way, you can start looking for different or clever ways to express the same logic that might perform better. But if you don't first take the step of better-defining your logic, you won't be able to validate your optimizations and you'll risk quickly producing incorrect data.
Upvotes: 2
Reputation: 1269603
Non-equality conditions -- such as OR
-- pretty much kill JOIN
performance, especially in databases such as SQL Server that do not use indexes in such cases.
I would recommend a two-join approach, but you are going to have to fix the SELECT
because it is not clear where the columns come from.
SELECT --A, B, C, D, E, F,
T1.A,
COALESCE(T2_1.B, T2_2.B) as B,
. . .
FROM TABLE1 T1 INNER JOIN
TABLE2 T2_1
ON T2.F = T1.F AND
(T1.E IS NULL OR T2_1.E IS NULL) LEFT JOIN
TABLE2 T2_2
ON T2_2.E = T1.E -- E cannot be NULL
WHERE T2_1.F IS NOT NULL OR T2_2.E IS NOT NULL; -- checks for a match for either condition
Then for performance, you want indexes on TABLE2(F, E)
and TABLE2(E)
.
Upvotes: 2
Reputation: 288
Statement OR might extremely decrease execution time. Try to get rid of it. Maybe something like this would do:
SELECT A,B,C,D,E,F
FROM TABLE1 T1
LEFT JOIN TABLE2 T2
ON T1.E = T2.E
LEFT JOIN TABLE2 T22
ON T1.F= T22.F
AND T2.E IS NULL
WHERE NOT (T2.E IS NULL AND T22.F IS NULL)
Upvotes: 0