AndyZ
AndyZ

Reputation: 615

SQL Server does not choose to use index although everything seems to suggest it

Something is wrong here and I don't understand what. It's worth to mention, there searched value is not in the table, for an existing value there is no problem. Though, why does the first query require a clustered key search for the primary key which is not even used in the query, while the second can run on the index directly. Forcing the query to use the index WITH(INDEX(indexname)) does work, but why does the optimizer not choose to use it by itself.

The column PIECE_NUM is not in any other index and is also not the primary key.

SET STATISTICS IO ON

    DECLARE  @vchEventNum   VARCHAR(50)
    SET  @vchEventNum = '54235DDS28KC1F5SJQMWZ'

    SELECT TOP 1 
            fwt.WEIGHT,
            fwt.TEST_RESULT
    FROM FIN_WEIGHT_TESTS fwt WITH(NOLOCK)
    WHERE fwt.PIECE_NUM LIKE @vchEventNum  + '%'  
    ORDER BY fwt.DTTM_INSERT DESC


    SELECT TOP 1 
            fwt.WEIGHT,
            fwt.TEST_RESULT
    FROM FIN_WEIGHT_TESTS fwt WITH(NOLOCK)
    WHERE fwt.PIECE_NUM LIKE '54235DDS28KC1F5SJQMWZ'  + '%'  
    ORDER BY fwt.DTTM_INSERT DESC

SET STATISTICS IO OFF

I let both queries run in one batch:

IO statistics report:

Query 1: logical reads 16244910

Query 2: logical reads 5

Table 'FIN_WEIGHT_TESTS'. Scan count 1, logical reads 16244910, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'FIN_WEIGHT_TESTS'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

The table has a non-clustered index on PIECE_NUM INCLUDING all three other columns from the query.

Here are the query execution plans(with a little editing to remove the actual names): enter image description here

I noticed the convert_implicit, but that is just due to conversion of the varchar parameter to nvarchar column. Changing the parameter type did not change the behaviour of the query.

Why does the query with the parameter not use the index while replacing the parameter with its value does?

Upvotes: 1

Views: 242

Answers (2)

sepupic
sepupic

Reputation: 8687

I think the cause of what happened is both use of @variable and ORDER BY in your query. To test my guess remove order by from your query and it may lead to equal plans in both cases (this time with different estimated number of rows reported in select).

As mentioned in previous answer, local variables cannot be sniffed at compilation time as the batch is seen as a whole thing, and only recompile option permits to server to know the value of a variable at the compilation time as recompilation begins when the variable is already assigned. This leads to "estimate for unknown" in the first case i.e. statistics cannot be used as we don't know the value in the filter, more rows in output are estimated.

But the query has top + order by in it. This means that if we expect many rows, to get only one but the first ordered by DTTM_INSERT DESC we must sort all the filtered rows. In fact if you look at the second plan you see that the SORT operator costs most of all. But when you use the constant, SQL Server uses the statistics and finds out that there will be only one row returned, so it can permit to sort the result.

In case of many rows expected it decides to use the index that is already ordered by DTTM_INSERT. It's only my guess because you did not post here the creation scripts for your indexes but from the plan I see that the first plan surely goes to the clustered index to grab the fields that are missing in non-clustered index, this means it's not the same non clustered that is used in the second case, but I'm sure the index chosen in the first case has THE LEADING KEY COLUMN DTTM_INSERT. Doing so server eliminates the sort that we see in the second plan

Upvotes: 1

JMabee
JMabee

Reputation: 2300

The first query is going to scan because you are using a local variable. The optimizer sees this as an "anonymous" value and therefore cannot use statistics to build a good query plan.

The second query seeks because it is a literal value and SQL can look into it's stats and knows much better how many estimated rows it will find with that value.

If you run your first query as follows I would imagine you will see it use the better plan:

DECLARE  @vchEventNum   VARCHAR(50)
SET  @vchEventNum = '54235DDS28KC1F5SJQMWZ'

SELECT TOP 1 
        fwt.WEIGHT,
        fwt.TEST_RESULT
FROM FIN_WEIGHT_TESTS fwt WITH(NOLOCK)
WHERE fwt.PIECE_NUM LIKE @vchEventNum  + '%'
ORDER BY fwt.DTTM_INSERT DESC
OPTION(RECOMPILE)

I would suggest using a parameterized procedure to run this code to ensure that it uses a cached plan. Using the RECOMPILE hint has it's own drawbacks as the optimizer will need to rebuild the plan every time it runs. So if you run this code very often I would avoid this hint.

You can read about local variables here:https://www.brentozar.com/archive/2014/06/tuning-stored-procedures-local-variables-problems/

Upvotes: 2

Related Questions