Amy B
Amy B

Reputation: 110101

My IN clause leads to a full scan of an index in T-SQL. What can I do?

I have a sql query with 50 parameters, such as this one.

DECLARE
  @p0 int, @p1 int, @p2 int, (text omitted), @p49 int

SELECT
  @p0=111227, @p1=146599, @p2=98917, (text omitted), @p49=125319

--
SELECT
  [t0].[CustomerID], [t0].[Amount],
  [t0].[OrderID], [t0].[InvoiceNumber]
FROM [dbo].[Orders] AS [t0]
WHERE ([t0].[CustomerID]) IN
  (@p0, @p1, @p2, (text omitted), @p49)

The estimated execution plan shows that the database will collect these parameters, order them, and then read the index Orders.CustomerID from the smallest parameter to the largest, then do a bookmark lookup for the rest of the record.

The problem is that there the smallest and largest parameter could be quite far apart and this will lead to reading possibly the entire index.

Since this is being done in a loop from the client side (50 params sent each time, for 1000 iterations), this is a bad situation. How can I formulate the query/client side code to get my data without repetitive index scanning while keeping the number of round trips down?


I thought about ordering the 50k parameters such that smaller readings of the index would occur. There is a wierd mitigating circumstance that prevents this - I can't use this solution. To model this circumstance, just assume that I only have 50 id's available at any time and can't control their relative position in the global list.

Upvotes: 2

Views: 1461

Answers (3)

p.campbell
p.campbell

Reputation: 100567

To build on Quassnoi's answer, if you were working with SQL 2008, you could save yourself some time by inserting all 50 items with one statement. SQL 2008 has a new feature for multiple valued inserts. e.g.

INSERT INTO @Customers (CustID)
VALUES (@p0),
       (@p1),
       <snip>
       (@p49)

Now @Customers table is populated and ready to INNER JOIN on, or your IN clause.

Upvotes: 1

Shea
Shea

Reputation: 11243

An index range scan is pretty fast. There's usually a lot less data in the index than in the table and there's a much better chance that the index is already in memory.

I can't blame you for wanting to save round trips to the server by putting each of the IDs your looking for in a bundle. If the index RANGE scan really worries you, you can create a parameterized server side cursor (e.g., in TSQL) that takes the CustomerID as a parameter. Stop as soon as you find a match. That query should definitely use an index unique scan instead of a range scan.

Upvotes: 1

Quassnoi
Quassnoi

Reputation: 425341

Insert the parameters into a temporary table, then join it with your table:

DECLARE @params AS TABLE(param INT);

INSERT
INTO    @params
VALUES  (@p1)
...
INSERT
INTO    @params
VALUES  (@p49)

SELECT
  [t0].[CustomerID], [t0].[Amount],
  [t0].[OrderID], [t0].[InvoiceNumber]
FROM @params, [dbo].[Orders] AS [t0]
WHERE ([t0].[CustomerID]) = @params.param

This will most probably use NESTED LOOPS with a INDEX SEEK over CustomerID on each loop.

Upvotes: 7

Related Questions