Reputation: 1975
I'm puzzled why I'm getting clustered index seeks on query on calculated column that uses simple DATEADD function (that should be deterministic) over other column which is cluster indexed.
Am I able to keep the nonmaterialized calculated column (since this is for existing huge legacy data) and be able to filter by this column and have the sql hit the index? I need to filter on the calculated column (as I cannot change the queries and they need to query with proper offset).
Simplistic example definition:
-- table with calculated column
CREATE TABLE [dbo].[_test](
[Dt] [datetime] NOT NULL
) ON [PRIMARY]
GO
--clustered index
CREATE CLUSTERED INDEX [Idx] ON [dbo].[_test]
(
[Dt] ASC
) ON [PRIMARY]
GO
--simulation of legacy data
DECLARE @rdate DATE
DECLARE @startLoopID INT = 1
DECLARE @endLoopID INT = 1000
WHILE @startLoopID <= @endLoopID
BEGIN
SET @rdate = DATEADD(Hour, ABS(CHECKSUM(NEWID()) % (365 * 24) ), '2020-01-01');
SET @startLoopID = @startLoopID + 1;
INSERT INTO [_test] (Dt)
VALUES (@rdate);
END
--adding the calculated column with proper offset
alter table _test ADD [Dt_2] AS DATEADD(MINUTE, 300, Dt)
Example execution: DECLARE @EndTime DATETIME2(7) = '2020-07-10 00:00:00.000'
DECLARE @StartTime DATETIME2(7) = DATEADD(day, -12, @EndTime)
--select is performing seek
select * from _test
where
Dt > @StartTime AND Dt < @EndTime
--select is performing scan
select * from _test
where
Dt_2 > @StartTime AND Dt_2 < @EndTime
Query plan:
When filtering on Dt
column I get expected seek. When filtering on Dt_2
column - which is calculated from Dt
via deterministic function - I get index scan.
In a real scenario with large amount of data this is causing huge performance hit.
Upvotes: 0
Views: 613
Reputation: 4454
You need to index the calculated column in order for SQL to be able to perform a seek. The calculated column only gets evaluated at select-time unless it is persisted or indexed. Even if it was marked as persisted
, it would still need to be scanned without an index. The fact that the expression is deterministic and precise means it can be indexed, but you still need to add the index.
Your comment below indicates you are unable to add an index to the computed column. But in this particular case, you don't actually need one anyway in order to perform the same query, because your calculated column is adding a constant offset to your original column. Therefore, the constant expression can be moved to the right hand side of the comparison operation in the query itself, and SQL will still be able to use the original clustered index.
That is to say, instead of creating a new column which is equal to Dt
plus some constant, you can subtract the constant from the RHS of the inequality. Instead of:
alter table _test ADD [Dt_2] AS DATEADD(MINUTE, 300, Dt);
DECLARE @EndTime DATETIME2(7) = '2020-07-10 00:00:00.000'
DECLARE @StartTime DATETIME2(7) = DATEADD(day, -12, @EndTime);
select * from _test
where
Dt_2 > @StartTime AND Dt_2 < @EndTime;
You can use:
--subtract 300 minutes from the @endTime parameter instead of adding 300 minutes to every value of Dt
DECLARE @EndTime DATETIME2(7) = dateadd(minute, -300, '2020-07-10 00:00:00.000');
DECLARE @StartTime DATETIME2(7) = DATEADD(day, -12, @EndTime);
select * from _test
where
Dt > @StartTime
AND Dt < @EndTime;
Upvotes: 1