SELECT clause column causes index not being used

Question

I tried to index a table today and noticed something awkward and couldn't understand the reason (even if it seems so simple?). I have a table like this:

CREATE TABLE [X].[A](
    [TableName] [varchar](250) NOT NULL,
    [Header] [varchar](4000) NOT NULL,
    [Trailer] [varchar](4000) NOT NULL,
    [FileName] [varchar](50) NOT NULL,
    [StartDate] [date] NOT NULL,
    [EndDate] [date] NOT NULL
) ON [PRIMARY]

Let's assume that I have a query like this (I simplified it a bit):

SELECT 
    Header
FROM
    X.A ht (NOLOCK)
WHERE
    ht.TableName = 'asd' AND
    '2017-05-29' BETWEEN ht.StartDate AND ht.EndDate

I wanted to put an index to (TableName, StartDate, EndDate) since they are in the where clause. My index is below:

CREATE NONCLUSTERED INDEX [NonClusteredIndex-20170530-113847] ON [X].[A]
(
    [TableName] ASC,
    [StartDate] ASC,
    [EndDate] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO

But I noticed that my query was not using the index and it does a Table Scan, but when I change the SELECT clause like the following:

SELECT 
    TableName
FROM
    X.A ht (NOLOCK)
WHERE
    ht.TableName = 'asd' AND
    '2017-05-29' BETWEEN ht.StartDate AND ht.EndDate

It does use my index and does Index Seek which is what I want. What's going on?

GarethD · Accepted Answer

In a really simplified example, if your table had data like this:

ID | TableName | Header | Trailer | FileName | StartDate  | EndDate
---+-----------+--------+---------+----------+------------+------------
 1 |    asd    |   H1   |    T1   |    F1    | 2017-05-28 | 2017-05-30
 2 |    asd    |   H2   |    T2   |    F2    | 2017-05-28 | 2017-05-30
 3 |    asd    |   H3   |    T3   |    F3    | 2017-05-28 | 2017-05-30
 4 |    asd    |   H4   |    T4   |    F4    | 2017-05-28 | 2017-05-30
 5 |    asd    |   H5   |    T5   |    F5    | 2017-05-28 | 2017-05-30
 6 |    asd    |   H6   |    T6   |    F6    | 2017-05-28 | 2017-05-30
 7 |    asd    |   H7   |    T7   |    F7    | 2017-05-28 | 2017-05-30
 8 |    xyz    |   H1   |    T1   |    F1    | 2017-05-28 | 2017-05-30

Then your index would look something like this

TableName | StartDate  | EndDate    | ID | 
----------+------------+------------+----+-
   asd    | 2017-05-28 | 2017-05-30 |  1 | 
   asd    | 2017-05-28 | 2017-05-30 |  2 | 
   asd    | 2017-05-28 | 2017-05-30 |  3 | 
   asd    | 2017-05-28 | 2017-05-30 |  4 | 
   asd    | 2017-05-28 | 2017-05-30 |  5 | 
   asd    | 2017-05-28 | 2017-05-30 |  6 | 
   asd    | 2017-05-28 | 2017-05-30 |  7 | 
   xyz    | 2017-05-28 | 2017-05-30 |  8 |

In this case you can get back all the columns required for this query:

SELECT 
    TableName
FROM
    X.A ht (NOLOCK)
WHERE
    ht.TableName = 'asd' AND
    '2017-05-29' BETWEEN ht.StartDate AND ht.EndDate

But, if you want to select Header then you can't just use the index (since Header) is not stored in the index, for each row you have to find the clustering key (which in this simple example is ID), then lookup the corresponding row in the main table to get the value of Header. This (Index Seek + Key Lookup) is an expensive operation, and since you have to do this for 9 rows, and the original table only includes 10, it is probably more efficient to simple search the main table in the first place and not use the index (Clustered Index Scan).

Had you been looking up a very small number of records, like xyd, then it is likely that an index seek along with key lookup would have been chosen. I don't think there is a simple cut off for how many records are needed to switch between the two methods, but in an answer to another question I found a cut off value of 3.7% of records, on a fairly small table. I expect as the table size increases this percentage decreases.

SELECT clause column causes index not being used

Answers (2)

Related Questions