Reputation: 2517
I have some heavy query to make. I can perform it in one of either two ways:
Combined and complicated WHERE
clause:
SELECT *
FROM Talks d
WHERE (d.UserId IN (SELECT SomeGuyUserId FROM Network WHERE MainUserId = @ViewerUserId AND IsX = 0 AND IsY = 0)
AND DATEDIFF(DAY,d.Started,GETDATE()) < 3
AND d.Status <= 100)
OR
(d.UserId IN (SELECT SomeGuyUserId FROM Network WHERE MainUserId = @ViewerUserId AND IsX = 1 AND IsY = 0)
AND DATEDIFF(DAY,d.Started,GETDATE()) < 3
AND d.Status <= 110)
ORDER BY d.UserId;
Two different SELECTS
with UNION
:
SELECT *
FROM Talks d
WHERE (d.UserId IN (SELECT SomeGuyUserId FROM Network WHERE MainUserId = @ViewerUserId AND IsX = 0 AND IsY = 0)
AND DATEDIFF(DAY,d.Started,GETDATE()) < 3
AND d.Status <= 100)
UNION
SELECT *
FROM Talks d
(d.UserId IN (SELECT SomeGuyUserId FROM Network WHERE MainUserId = @ViewerUserId AND IsX = 1 AND IsY = 0)
AND DATEDIFF(DAY,d.Started,GETDATE()) < 3
AND d.Status <= 110)
I have three questions:
ORDER BY
) the unified result (like in the first option).PS. all tables are pretty big (> 10M records in each) and this query should be used extensively by some website.
Upvotes: 0
Views: 110
Reputation: 11018
In many cases, SQL Server is terrible at optimizing OR
, so it is very well possible you are better off with UNION
. But the only way to be sure, is to test both queries against a representatively big database. Use SQL Profiler to compare the metrics.
An alternative is to refactor the WHERE clause of approach 1; try to move the OR
deeper into the expression tree, or if possible, totally eliminate it. Example:
SELECT *
FROM Talks d
INNER JOIN Network n ON n.SomeGuyUserId = d.UserId
WHERE n.MainUserId = @ViewerUserId
AND n.IsX IN (0, 1)
AND n.IsY = 0
AND DATEDIFF(DAY, d.Started, GETDATE()) < 3
AND d.Status <= CASE n.IsX WHEN 0 THEN 100 WHEN 1 THEN 110 END
ORDER BY d.UserId;
Please note this query may result in duplicate records because I replaced the IN <subquery>
with a join. This happens when duplicate combinations of MainUserId
and SomeGuyUserId
exist in table Network
. If possible, use a unique constraint to avoid that situation.
Upvotes: 1
Reputation: 1815
Basically your first sql-query is not very complex. Compared to all the queries i've seen over the years it is in fact a simple one... i would just extract the DATEDIFF and go with that ...
SELECT *
FROM Talks d
WHERE DATEDIFF(DAY,d.Started,GETDATE()) < 3
and ( (d.UserId IN (SELECT SomeGuyUserId FROM Network WHERE MainUserId = @ViewerUserId AND IsX = 0 AND IsY = 0) AND d.Status <= 100)
OR
(d.UserId IN (SELECT SomeGuyUserId FROM Network WHERE MainUserId = @ViewerUserId AND IsX = 1 AND IsY = 0) AND d.Status <= 110)
)
ORDER BY d.UserId;
Upvotes: 0
Reputation: 36977
You have to look at the excution plans of both queries to decide that. Two different things could happen:
The first query results one full table scan on Talks
, while the
second query results in two full table scans (one for each part of the union). In that case, the first
query is probably faster.
The first query results one full table scan on Talks
, while the
second query results in two index range scans. In that case, the
second query is probably faster.
Upvotes: 0