Bendom
Bendom

Reputation: 175

Execution plan index usage

I try to optimize index usage in some of the queries I use often. Recently I have learnt that using function and implicit conversions in where clause is not optimal. To my surprise implicit conversion can be faster due to proper index usage.

I have a table called Records. Clustered index & primary key is on Id column (int) and non-clustered index on column Created (datetime). To avoid implicit conversion I created @dd variable.

declare @dd Datetime

set @dd = '2019-08-25'

--1st option
select  * from Records where Created > @dd
--2nd option
select  * from Records where Created > '2019-08-25'    
--3rd option
select  * from Records where Year(Created) = 2019 and MONTH(Created) = 8 and DAY(Created) = 25

Unfortunately 1st option uses Index scan (column Id). 2nd option uses index seek (column Created) and key lookup which is nice but I wonder why 1st option doesn't do the same. I've added 3rd option just to see the difference and it behaves as 1st option.

I've found query execution plan : missing index which leads to blog post about this behavior but it doesn't explain why it happens. I can see there is a difference in estimated number of rows. When I set date to '2019-06-25', all three option are using similar plan with index scan.

So is it a rule of thumb to use implicit conversion when I can expect number of rows will be low? I am quite confused here.

Upvotes: 1

Views: 160

Answers (1)

Dan Guzman
Dan Guzman

Reputation: 46203

Implicit conversion of datetime literals is required because T-SQL doesn't have syntax for a datetime literal. This is not a performance concern.

Implicit conversion is a concern in cases where it results in a non-sargable expression due to data type precedence (e.g. comparing an int literal with a varchar column). Applying functions like YEAR to columns also results in a non-sargable expression because the function result must be evaluated before the comparison.

The issue in your queries is that you are using a variable instead of parameter. You should get the same performance as the literal with:

select * from Records where Created > @dd OPTION(RECOMPILE);

or a parameterized query (assuming the cached plan does not already exist):

EXEC sp_executesql N'select * from Records where Created > @dd'
    , N'@dd datetime'
    , @dd = '2019-08-25';

With the literal, parameter, and OPTION(RECOMPLE) with a variable, SQL Server uses the statistics histogram (if one exists) during the initial compilation to better estimate the number of rows that may be returned. In the case of a variable, SQL Server uses average density statistics to estimate the number of rows. Different plans may be used as a result, especially with skewed data distribution.

Upvotes: 0

Related Questions