neves
neves

Reputation: 39173

SQL query with an IN operator stops seeking an index and starts to scan it, becomes 100x slower

I have a slow query in MS SqlServer due to a index scan. I'm trying to optimize it, but I'm stuck. The query is inside a stored procedure and is something like this:

declare @last_day_of_month datetime
set @last_day_of_month = '2014-08-31'

select value1, value2, value3
from multiple_tables
join my_big_table on predicate
where my_big_table.date IN (@last_day_of_month, @last_day_of_month+1)

the query above makes a clustered index scan in my_big_table. It takes 5 sec on average. I thought it was a case of parameter sniffing, because this version of the query makes a index seek:

select value1, value2, value3
from multiple_tables
join my_big_table on predicate
where my_big_table.date IN ('2014-08-31', '2014-09-01')

The query response is immediate. Takes less than 1 sec. What puzzles me is that the version of the query below also makes a index seek:

set @last_day_of_month = '2014-08-31'

select value1, value2, value3
from multiple_tables
join my_big_table on predicate
where my_big_table.date IN (@last_day_of_month)

The predicates below all make a index scan:

where my_big_table.date IN (@last_day_of_month, '2014-09-01')
where my_big_table.date IN (@last_day_of_month, @first_day_of_month)

An extra complication is that my_big_table isn't really a table, but a simple view with just one table. This fact forbids me to use a FORCESEEK query option.

The behavior is the same if I use an OR clause instead of an IN. It was originally an OR.

In my current situation, it is quicker to make the query twice and make a UNION ALL with the results. But it is ugly. Can some one help me?

Upvotes: 3

Views: 91

Answers (2)

neves
neves

Reputation: 39173

A simple modification made the query execute a index seek. I've changed the IN for a BETWEEN operator:

declare @last_day_of_month datetime
set @last_day_of_month = '2014-08-31'

select value1, value2, value3
from multiple_tables
join my_big_table on predicate
where my_big_table.date BETWEEN @last_day_of_month and (@last_day_of_month+1)

I hypothesize that with the IN, the query optimizer can't reason about the values relationship, and concludes that they can be anywhere in the index. Specifying the BETWEEN, I'm telling the optimizer that the second one is bigger than the first, so a Index Seek will be a good option.

Upvotes: 0

Pieter Geerkens
Pieter Geerkens

Reputation: 11883

I never use the IN operator except for constant sets. Try this instead, as the optimizer seems to understand JOINs better than INs:

select value1, value2, value3
from multiple_tables
join my_big_table on predicate
join ( values
     (@last_day_of_month)
    ,(@last_day_of_month+1)
) dates(day_of_month) on dates.day_of_month = my_big_table.date;

or

create table #dates(day_of_month datetime not null);
insert #dates(day_of_month)
    values (@last_day_of_month)
          ,(@last_day_of_month+1)
;
select value1, value2, value3
from multiple_tables
join my_big_table on predicate
join #dates on dates.day_of_month = my_big_table.date;

Upvotes: 3

Related Questions