Reputation: 48522
I have a table named Workflow. It has 37M rows in it. There is a primary key on the ID column (int) plus an additional column. The ID column is the first column in the index.
If I execute the following query, the PK is not used (unless I use an index hint)
Select Distinct(SubID) From Workflow Where ID >= @LastSeenWorkflowID
If I execute this query instead, the PK is used
Select Distinct(SubID) From Workflow Where ID >= 786400000
I suspect the problem is with using the parameter value in the query (which I have to do). I really don't want to use an index hint. Is there a workaround for this?
Upvotes: 6
Views: 172
Reputation: 11
This is a classic example of local variable producing a sub-optimal plan.
You should use OPTION (RECOMPILE) in order to compile your query with the actual parameter value of ID.
See my blog post for more information: http://www.sqlbadpractices.com/using-local-variables-in-t-sql-queries/
Upvotes: 0
Reputation: 294447
Please post the execution plan(s), as well as the exact table definition, including all indexes.
When you use a variable the optimizer does no know what selectivity the query will have, the @LastSeenWorkflowID may filter out all but very last few rows in Workflow, or it may include them all. The generated plan has to work in both situations. There is a threshold at which the range seek over the clustered index is becoming more expensive than a full scan over a non-clustered index, simply because the clustered index is so much wider (it includes every column in the leaf levels) and thus has so much more pages to iterate over. The plan generated, which considers an unknown value for @LastSeenWorkflowID, is likely crossing that threshold in estimating the cost of the clustered index seek and as such it chooses the scan over the non-clustered index.
You could provide a narrow index that is aimed specifically at this query:
CREATE INDEX WorkflowSubId ON Workflow(ID, SubId);
or:
CREATE INDEX WorkflowSubId ON Workflow(ID) INCLUDE (SubId);
Such an index is too-good-to-pass for your query, no matter the value of @LastSeenWorkflowID.
Upvotes: 3
Reputation: 24506
Assuming your PK is an identity OR is always greater than 0, perhaps you could try this:
Select Distinct(SubID)
From Workflow
Where ID >= @LastSeenWorkflowID
And ID > 0
By adding the 2nd condition, it may cause the optimizer to use an index seek.
Upvotes: 2