Reputation: 191
I have two similar tables,
CREATE TABLE [dbo].[StockPrices] (
[Id] INT IDENTITY (1, 1) NOT NULL,
[CompanyId] INT NOT NULL,
[Date] DATETIME NOT NULL,
[Open] DECIMAL (18, 2) NOT NULL,
[Close] DECIMAL (18, 2) NOT NULL,
[Low] DECIMAL (18, 2) NOT NULL,
[High] DECIMAL (18, 2) NOT NULL,
[Volume] INT NOT NULL,
CONSTRAINT [PK_dbo.StockPrices] PRIMARY KEY NONCLUSTERED ([Id] ASC),
CONSTRAINT [FK_dbo.StockPrices_dbo.Companies_CompanyId] FOREIGN KEY ([CompanyId]) REFERENCES [dbo].[Companies] ([Id]) ON DELETE CASCADE
);
GO
CREATE CLUSTERED INDEX [IX_CompanyId] ON [dbo].[StockPrices]([CompanyId] ASC);
GO
CREATE NONCLUSTERED INDEX [IX_Date] ON [dbo].[StockPrices]([Date] ASC);
and
CREATE TABLE [dbo].[News] (
[Id] INT IDENTITY (1, 1) NOT NULL,
[NewsProviderId] INT NOT NULL,
[CompanyId] INT NOT NULL,
[Date] DATETIME NOT NULL,
[Title] NVARCHAR (128) NOT NULL,
[Description] NVARCHAR (256) NOT NULL,
[Url] NVARCHAR (256) NOT NULL,
CONSTRAINT [PK_dbo.News] PRIMARY KEY NONCLUSTERED ([Id] ASC),
CONSTRAINT [FK_dbo.News_dbo.Companies_CompanyId] FOREIGN KEY ([CompanyId]) REFERENCES [dbo].[Companies] ([Id]) ON DELETE CASCADE
);
GO
CREATE CLUSTERED INDEX [IX_CompanyId] ON [dbo].[News]([CompanyId] ASC);
GO
CREATE NONCLUSTERED INDEX [IX_Date] ON [dbo].[News]([Date] ASC);
GO
and two similar queries
select *
from news
where companyid = 1
and date >= '01/01/2010'
and date <= '01/31/2010'
order by date;
select *
from stockprices
where companyid = 1
and date >= '01/01/2010'
and date <= '01/31/2010'
order by date;
and I am getting two completely different actual execution plans
Query1: Relative to the batch: 86%
SELECT (COST 0%) <- Nested Loops (Inneer Join)(Cost 0%) <- Index Seek (NonClustered) [News].[IX_Date](Cost 1%) <- Key Lookup (Clustered) [News].[IX_CompanyId](Cost 99%)
Query 2: Relative to the batch: 14%
SELECT (Cost0%) <- Sort (Cost 33%) <- Clustered Index Scan (Clustered) [StockPrices]IX_CompanyId
I am not sure why? Can you advice something?
Upvotes: 2
Views: 128
Reputation: 452988
The first one is using a seek on a non covering clustered index in date order and a key lookup to get the remaining columns for rows matching companyid = 1
.
The second one is doing a scan on a covering index and then sorting the filtered result.
This is a cost based decision depending on what proportion of the table is estimated to match and the width of the two indexes (some example calculations here).
Key lookups are expensive as each one requires performing a clustered index seek to locate the page and row concerned. This means multiple pages must be read for each row found by the nonclustered index seek (as many pages as the depth of the clustered index). Moreover the clustered index page found for one row may very likely not be correlated to the page for the next row, entailing lots of random IO.
As a result the tipping point before the plan switches to an index scan can be a very low proportion of the table. The fact that the non covering index can avoid a sort in this case may make the tipping point a bit higher than otherwise.
Look at the estimated rows from each. Also consider that the News
table contains various string columns and likely fits less rows on a clustered index page than the numeric values in the StockPrices
table - so a full clustered index scan on News
may well be relatively more expensive and cause a higher tipping point.
Upvotes: 3