AngryHacker
AngryHacker

Reputation: 61646

Why doesn't this query use the proper index?

Table definition:

CREATE TABLE [dbo].[AllErrors](
  [ID] [int] IDENTITY(1,1) NOT NULL,
  [DomainLogin] [nvarchar](50) NULL,
  [ExceptionDate] [datetime] NULL,
  [ExceptionDescr] [nvarchar](max) NULL,
  [MarketName] [nvarchar](50) NULL,
  [Version] [nvarchar](50) NULL,
  CONSTRAINT [PK_AllErrors] PRIMARY KEY CLUSTERED ([ID] ASC)
)

-- Add an index on the date
CREATE NONCLUSTERED INDEX [IX_ExceptionDate] ON [dbo].[AllErrors] ([ExceptionDate] ASC)

I run this query:

declare @yesterday datetime
select @yesterday = getdate() - 1

SELECT * INTO #yst
from AllErrors 
where ExceptionDate between @yesterday and @yesterday + 1

enter image description here

This code does not use my IX_ExceptionDate (as gleaned from the execution plan). It does a clustered scan on the primary key index. However, the code below does use the IX_ExceptionDate index:

SELECT * INTO #yst
from AllErrors 
where ExceptionDate between @yesterday and @yesterday + 1
  AND ExceptionDate = ExceptionDate

enter image description here

Why is this?

EDIT: Visual Execution Plan added.

EDIT: Textual Execution Plans below.

Query 1:

|--Table Insert(OBJECT:([#yst]), SET:([#yst].[ID] = [Expr1006],[#yst].[DomainLogin] = [MarketStats].[dbo].[AllErrors].[DomainLogin],[#yst].[ExceptionDate] = [MarketStats].[dbo].[AllErrors].[ExceptionDate],[#yst].[ExceptionDescr] = [MarketStats].[dbo].[AllErrors].[ExceptionDescr],[#yst].[MarketName] = [MarketStats].[dbo].[AllErrors].[MarketName],[#yst].[Version] = [MarketStats].[dbo].[AllErrors].[Version])) |--Top(ROWCOUNT est 0) |--Compute Scalar(DEFINE:([Expr1006]=setidentity([MarketStats].[dbo].[AllErrors].[ID],(-7),(0),N'#yst'))) |--Clustered Index Scan(OBJECT:([MarketStats].[dbo].[AllErrors].[PK_AllErrors]), WHERE:([MarketStats].[dbo].[AllErrors].[ExceptionDate]>=[@yesterday] AND [MarketStats].[dbo].[AllErrors].[ExceptionDate]<=[@yesterday]+'1900-01-02 00:00:00.000'))

Query 2:

|--Table Insert(OBJECT:([#yst]), SET:([#yst].[ID] = [Expr1006],[#yst].[DomainLogin] = [MarketStats].[dbo].[AllErrors].[DomainLogin],[#yst].[ExceptionDate] = [MarketStats].[dbo].[AllErrors].[ExceptionDate],[#yst].[ExceptionDescr] = [MarketStats].[dbo].[AllErrors].[ExceptionDescr],[#yst].[MarketName] = [MarketStats].[dbo].[AllErrors].[MarketName],[#yst].[Version] = [MarketStats].[dbo].[AllErrors].[Version])) |--Top(ROWCOUNT est 0) |--Compute Scalar(DEFINE:([Expr1006]=setidentity([MarketStats].[dbo].[AllErrors].[ID],(-7),(0),N'#yst'))) |--Nested Loops(Inner Join, OUTER REFERENCES:([MarketStats].[dbo].[AllErrors].[ID], [Expr1008]) OPTIMIZED WITH UNORDERED PREFETCH) |--Index Seek(OBJECT:([MarketStats].[dbo].[AllErrors].[IX_ExceptionDate]), SEEK:([MarketStats].[dbo].[AllErrors].[ExceptionDate] >= [@yesterday] AND [MarketStats].[dbo].[AllErrors].[ExceptionDate] <= [@yesterday]+'1900-01-02 00:00:00.000'), WHERE:([MarketStats].[dbo].[AllErrors].[ExceptionDate]=[MarketStats].[dbo].[AllErrors].[ExceptionDate]) ORDERED FORWARD) |--Clustered Index Seek(OBJECT:([MarketStats].[dbo].[AllErrors].[PK_AllErrors]), SEEK:([MarketStats].[dbo].[AllErrors].[ID]=[MarketStats].[dbo].[AllErrors].[ID]) LOOKUP ORDERED FORWARD)

Upvotes: 3

Views: 511

Answers (2)

Martin Smith
Martin Smith

Reputation: 453910

It doesn't know what the value of the variables will be when it compiles the query. You could try OPTION (RECOMPILE).

I presume that the addition of the AND clause in the query (even though logically it makes it no more selective at all) must mislead the optimiser into estimating the query with greater selectivity thus giving you the plan that you wanted!

You say in the comments that the version without the ExceptionDate = ExceptionDate is estimated at 88234.8 rows and the version with 8823.48

Generally in the absence of usable statistics SQL Server falls back to heuristics dependant upon the type of comparison operator in the predicate.

It assumes that a > predicate will return 30% of the rows for example and that an = predicate will return 10% of the rows so it looks like it is just applying that directly to the result of the first estimate. Interesting that it does not take account of the fact that the equals is against the column itself here!

c.f. Best Practices for Managing Statistics - Avoid use of local variables in queries

Upvotes: 6

SRoderick
SRoderick

Reputation: 61

Short answer: Because of the "SELECT *", your query hits the clustered index: the Key Lookup operation is much more costly than a clustered index scan.

See the differing query plans resulting from

declare @yesterday datetime
select @yesterday = getdate() - 1

SELECT * INTO dbo.#yst
from AllErrors WITH (INDEX = IX_ExceptionDate)
where ExceptionDate between @yesterday and @yesterday + 1

AND

declare @yesterday datetime
select @yesterday = getdate() - 1

SELECT * INTO dbo.#yst
from AllErrors
where ExceptionDate between @yesterday and @yesterday + 1

AND

declare @yesterday datetime
select @yesterday = getdate() - 1

SELECT ExceptionDate INTO dbo.#yst
from AllErrors
where ExceptionDate between @yesterday and @yesterday + 1

Upvotes: 5

Related Questions