Reputation: 573
We have a very large table, where every day 1-2 million rows are being added to the table.
In this query:
SELECT jobid, exitstatus
FROM jobsData
WHERE finishtime >= {ts '2012-10-04 03:19:26'} AND task = 't1_345345_454'
GROUP BY jobid, exitstatus
Indexes exists for both Task
and FinishTime
.
We expected that the task
index will be used since it has much fewer rows. The problem that we see is that SQL Server creates a bad query execution plan which uses the FinishTime
index instead of the task, and the query takes very long time.
This happens when the finish time value is outside the FinishTime
index histogram.
Statistics are updated every day / several hours, but there are still many cases where the queries are for recent values.
The question: we can see clearly in the estimated execution plan that the estimated number of rows for the FinishTime
is 1
in this case, so the FinishTime
index is selcted. Why SQL Server assumes that this is 1
if there is no data? Is there a way to tell it to use something more reasonable?
When we replace the date with a bit earlier one, statistics exists in the histogram and the estimated number of rows is ~7000
Upvotes: 1
Views: 507
Reputation: 294407
You can use a Plan Guide to instruct the optimizer to use a specific query plan for you. This fits well for generated queries that you cannot modify to add hints.
Upvotes: 4