Craig

Reputation: 18734

Wildcard Pattern usage

I've had a SQL performance review done on a project we're working on, and one 'Critical' item that has come up is this:

This kind of wildcard query pattern will cause a table scan, resulting in poor query performance.

 SELECT *
 FROM TabFoo
 WHERE ColBar = @someparam OR @someparam IS NULL

Their recommendation is:

In many cases, an OPTION (RECOMPILE) hint can be a quick workaround. From a design point of view, you can also consider using separate If clauses or (not recommended) use a dynamic SQL statement.
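Applied to the query they flagged, the hint would look something like this (a sketch based on their snippet; with RECOMPILE the plan is rebuilt for the actual parameter value on each execution, so the `@someparam IS NULL` branch can be folded away):

```sql
SELECT *
FROM TabFoo
WHERE ColBar = @someparam OR @someparam IS NULL
OPTION (RECOMPILE); -- plan compiled per execution for the actual @someparam value
```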

Dynamic SQL surely isn't the right way forward. Basically the procedure is one where I am searching for something, or something else. Two parameters come into the procedure, and I am filtering on one or the other.

A better example than what they showed is:

SELECT ..
FROM...
WHERE (ColA = @ParA OR @ParA IS NULL)
  AND (ColB = @ParB OR @ParB IS NULL)

Is that bad practice? And besides dynamic SQL (which, I thought, can't really be compiled into a more efficient execution plan?), how would this best be done?

Upvotes: 0

Views: 232

Answers (3)

Russ Perry Jr

Reputation: 33

We did a Microsoft engagement where they noted that we had a ton of this "Wildcard Pattern Usage", and their suggestion was to convert the query to an IF/ELSE structure...

    IF (@SomeParam is null) BEGIN
        SELECT *
        FROM TabFoo
    END
    ELSE BEGIN
        SELECT *
        FROM TabFoo
        WHERE ColBar = @someparam
    END

They preferred this approach over recompile (adds to execution time) or dynamic code (can't plan ahead, so kind of the same thing, having to figure out the plan every time); and I seem to recall that it is still an issue even with local variables (plus, you need extra memory regardless).

You can see that things get a bit crazy if you write queries with multiple WPU issues, but at least for the smaller ones, MS recommends the IF/ELSE approach.

In all the examples I saw, NULL was involved, but I can't help thinking that if you had a parameter with a default (whether on the parameter itself or set with an ISNULL()) and used essentially the same pattern, that might also be bad (well, as long as the default is something an "actual value" could never be, that is).
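For illustration, the default-based variant I have in mind would look something like this (a sketch; the -1 sentinel and the int type are hypothetical, and the pattern only works if -1 can never be a real ColBar value):

```sql
CREATE PROCEDURE dbo.spFooDefault
    @someparam int = -1  -- hypothetical sentinel meaning "no filter"; must never be a real ColBar value
AS
    SELECT *
    FROM TabFoo
    WHERE ColBar = @someparam OR @someparam = -1;
```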

Upvotes: 1

Nicholas Carey

Reputation: 74385

A query like

select *
from foo
where foo.bar = @p OR @p is null

might or might not cause a table scan. My experience is that it will not: the optimizer is perfectly able to do an index seek on the expression foo.bar = @p, assuming a suitable index exists. Further, it's perfectly able to short-circuit things if the variable is null. You won't know what your execution plan looks like until you try it and examine the actual execution plan. A better technique, however, is this:

select *
from foo
where foo.bar = coalesce(@p,foo.bar)

which will give you the same behavior, provided foo.bar is not nullable (if foo.bar is null, the predicate foo.bar = foo.bar evaluates to unknown and the row is filtered out even when @p is null).

If you are using a stored procedure, one thing that can and will bite you in the tookus is something like this:

create procedure dbo.spFoo
  @p varchar(32)
as

  select *
  from dbo.foo
  where foo.bar = @p or @p is null

  return @@rowcount

The direct use of the stored procedure parameter in the where clause will cause the cached execution plan to be based on the value of @p on its first execution. That means that if the first execution of your stored procedure has an outlier value for @p, you may get a cached execution plan that performs really poorly for the 95% of "normal" executions and really well only for the oddball cases. To prevent this from occurring, you want to do this:

create procedure dbo.spFoo
  @p varchar(32)
as

  declare @pMine varchar(32)
  set @pMine = @p

  select *
  from dbo.foo
  where foo.bar = @pMine or @pMine is null

  return @@rowcount

That simple assignment of the parameter to a local variable makes it an expression and so the cached execution plan is not bound to the initial value of @p. Don't ask how I know this.
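Applied to the two-parameter shape from the question, the same trick would look something like this (a sketch; the procedure name and varchar types are assumptions, the table and column names come from the question):

```sql
create procedure dbo.spSearch  -- hypothetical name
  @ParA varchar(32),
  @ParB varchar(32)
as

  -- copy parameters to locals so the cached plan is not bound to the first-seen values
  declare @a varchar(32) = @ParA
  declare @b varchar(32) = @ParB

  select *
  from dbo.TabFoo
  where (ColA = @a or @a is null)
    and (ColB = @b or @b is null)
```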

Further the recommendation you received:

In many cases, an OPTION (RECOMPILE) hint can be a quick workaround. From a design point of view, you can also consider using separate If clauses or (not recommended) use a dynamic SQL statement.

is hogwash. Option(recompile) means that the stored procedure is recompiled on every execution. While the stored procedure is being compiled, compile-time locks are taken out on dependent objects. Further, nobody else is going to be able to execute the stored procedure until the compilation is completed. This has, shall we say, a negative impact on concurrency and performance. Use of option(recompile) should be a measure of last resort.

Write clean SQL and vet your execution plans using production data, or as close as you can get to it: the execution plan you get is affected by the size and shape/distribution of the data.

Upvotes: 2

CodeMonkey1313

Reputation: 16031

I could be wrong, but I'm pretty sure a table scan will occur no matter what if the column in your where clause isn't indexed. Also, you could probably get better performance by reordering your OR clauses so that @ParA IS NULL comes first: if it is true, evaluating the column value wouldn't be required. Something to remember is that the where clause is evaluated for every row that comes back from the from clause.

I would not recommend dynamic SQL, and honestly, even under relatively heavy load I'd find it difficult to believe that this form of filter would cause a significant performance hit, since a table scan is required any time the column isn't indexed.
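If ColBar isn't indexed, a supporting index is what gives the optimizer a seek to use in the first place. Something like this (a sketch; the index name is hypothetical and assumes the TabFoo example from the question):

```sql
-- hypothetical index so that ColBar = @someparam can be satisfied by a seek
CREATE NONCLUSTERED INDEX IX_TabFoo_ColBar
ON TabFoo (ColBar);
```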

Upvotes: 1
