Reputation: 18734
I've had a SQL performance review done on a project we're working on, and one 'Critical' item that has come up is this:
This kind of wildcard query pattern will cause a table scan, resulting in poor query performance.
SELECT *
FROM TabFoo
WHERE ColBar = @someparam OR @someparam IS NULL
Their recommendation is:
In many cases, an OPTION (RECOMPILE) hint can be a quick workaround. From a design point of view, you can also consider using separate If clauses or (not recommended) use a dynamic SQL statement.
Dynamic SQL surely isn't the right way forward. Basically the procedure is one where I am searching for something, OR something else. Two parameters come into the procedure, and I am filtering on one, or the other.
A better example than what they showed is:
SELECT ..
FROM...
WHERE (ColA = @ParA OR @ParA IS NULL)
AND (ColB = @ParB OR @ParB IS NULL)
Is that bad practice, and besides dynamic SQL (because I thought dynamic SQL can't really be compiled and so can't benefit from a cached execution plan?), how would this best be done?
Upvotes: 0
Views: 232
Reputation: 33
We did a Microsoft engagement where they noted that we had a ton of this "Wildcard Pattern Usage", and their suggestion was to convert the query to an IF/ELSE structure...
IF (@SomeParam IS NULL) BEGIN
    SELECT *
    FROM TabFoo
END
ELSE BEGIN
    SELECT *
    FROM TabFoo
    WHERE ColBar = @SomeParam
END
They preferred this approach over recompile (it adds to execution time) or dynamic code (the plan can't be worked out ahead of time, so it's kind of the same thing, having to figure out the plan on every execution); and I seem to recall that it is still an issue even with local variables (plus, you need extra memory regardless).
You can see that things get a bit crazy if you write queries with multiple WPU issues, as sketched below, but at least for the smaller ones, MS recommends the IF/ELSE approach.
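To make the "crazy" concrete, here is a minimal sketch of the same rewrite with two optional parameters, borrowing the TabFoo/ColA/ColB/@ParA/@ParB names from the question; the number of branches doubles with every extra parameter.
-- Four branches for two optional parameters; 2^n branches for n parameters.
IF (@ParA IS NULL AND @ParB IS NULL) BEGIN
    SELECT * FROM TabFoo
END
ELSE IF (@ParB IS NULL) BEGIN
    SELECT * FROM TabFoo WHERE ColA = @ParA
END
ELSE IF (@ParA IS NULL) BEGIN
    SELECT * FROM TabFoo WHERE ColB = @ParB
END
ELSE BEGIN
    SELECT * FROM TabFoo WHERE ColA = @ParA AND ColB = @ParB
END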
In all the examples I saw, NULL was involved, but I can't help thinking that if you had a parameter with a default value, whether set on the parameter itself or with an ISNULL(), and used essentially the same pattern, that might also be bad (as long as the default is something an "actual value" would never be, that is).
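For instance, a hypothetical version of the original procedure using a sentinel default instead of NULL; the -1 sentinel, the int type, and the procedure name are all assumptions, and it only works if ColBar can never legitimately be -1.
CREATE PROCEDURE dbo.spFooDefault
    @someparam int = -1  -- hypothetical sentinel default
AS
SELECT *
FROM TabFoo
WHERE ColBar = @someparam OR @someparam = -1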
Upvotes: 1
Reputation: 74385
A query like
select *
from foo
where foo.bar = @p OR @p is null
might or might not cause a table scan. My experience is that it will not: the optimizer is perfectly able to do an index seek on the expression foo.bar = @p, assuming a suitable index exists. Further, it's perfectly able to short-circuit things if the variable is null. You won't know what your execution plan looks like until you try it and examine the bound execution plan. A better technique, however, is this:
select *
from foo
where foo.bar = coalesce(@p,foo.bar)
which will give you the same behavior (provided foo.bar is not nullable: a NULL in the column never matches coalesce(@p, foo.bar), whereas the OR form returns such rows when @p is null).
If you are using a stored procedure, one thing that can and will bite you in the tookus is something like this:
create procedure dbo.spFoo
@p varchar(32)
as
select *
from dbo.foo
where foo.bar = @p or @p is null
return @@rowcount
The direct use of the stored procedure parameter in the where clause will cause the cached execution plan to be based on the value of @p on its first execution. That means that if the first execution of your stored procedure has an outlier value for @p, you may get a cached execution plan that performs really poorly for the 95% of "normal" executions and really well only for the oddball cases. To prevent this from occurring, you want to do this:
create procedure dbo.spFoo
@p varchar(32)
as
declare @pMine varchar(32)
set @pMine = @p
select *
from dbo.foo
where foo.bar = @pMine or @pMine is null
return @@rowcount
That simple assignment of the parameter to a local variable makes it an expression, and so the cached execution plan is not bound to the initial value of @p. Don't ask how I know this.
Further, the recommendation you received:
In many cases, an OPTION (RECOMPILE) hint can be a quick workaround. From a design point of view, you can also consider using separate If clauses or (not recommended) use a dynamic SQL statement.
is hogwash. Option(recompile) means that the stored procedure is recompiled on every execution. When the stored procedure is being compiled, compile-time locks are taken out on dependent objects. Further, nobody else is going to be able to execute the stored procedure until the compilation is completed. This has, shall we say, a negative impact on concurrency and performance. Use of option(recompile) should be a measure of last resort.
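For completeness, a sketch of what the reviewers' OPTION (RECOMPILE) workaround would look like applied to the query above, if you decided to accept those costs:
select *
from dbo.foo
where foo.bar = @p or @p is null
option (recompile)  -- forces a fresh plan on every execution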
Write clean SQL and vet your execution plans using production data, or as close as you can get to it: the execution plan you get is affected by the size and shape/distribution of the data.
Upvotes: 2
Reputation: 16031
I could be wrong, but I'm pretty sure a table scan will occur no matter what if the column you have in your where clause isn't indexed. Also, you could probably get better performance by reordering your OR clauses so that if @ParA IS NULL is true, it evaluates first and would not require evaluating the value in the column. Something to remember is that the where clause is evaluated for every row that comes back from the from clause. I would not recommend dynamic SQL, and honestly, even under relatively heavy load I'd find it difficult to believe that this form of filter would cause a significant performance hit, since a table scan is required anytime the column isn't indexed.
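If the column isn't indexed, a minimal sketch of the fix, assuming the TabFoo/ColBar names from the question (the index name is made up):
-- Hypothetical index so the ColBar filter can seek instead of scanning.
CREATE NONCLUSTERED INDEX IX_TabFoo_ColBar ON TabFoo (ColBar);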
Upvotes: 1