Reputation: 521
I have a query that includes this:
... AND Record.RecordID IN (1,2,3,10,11,12,13,16,17,18,26,27,28,557,31,32,33,36,37,93) AND ...
The problem seems to be that if there are 20 or more items in that list, the query takes over 25 seconds to execute. If there are fewer than 20, it executes immediately. Any ideas on how to optimize?
Upvotes: 4
Views: 392
Reputation: 47
It seems that when you add the 20th item, the optimizer generates a different execution plan. Execution plans are built from statistics: the search criteria affect the estimated number of returned rows, and as you add more items to the list, that estimate changes and the optimizer may switch to a different plan.
Check the execution plan (Ctrl-L) of both queries. It's the only way to learn why the query takes more time once you have 20 or more items in the list.
Before examining the execution plans, update the statistics of your table:
UPDATE STATISTICS records
or, if you can wait:
UPDATE STATISTICS records WITH FULLSCAN
The second one will take more time, but it gives you more accurate statistics.
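If it helps to see both plans side by side, here's a minimal sketch (assuming SQL Server and a table named Records, as in Andomar's answer; SET STATISTICS PROFILE returns the actual plan as an extra result set):
SET STATISTICS PROFILE ON
-- 19-item list (fast)
SELECT r.* FROM Records r
WHERE r.RecordID IN (1,2,3,10,11,12,13,16,17,18,26,27,28,557,31,32,33,36,37)
-- 20-item list (slow)
SELECT r.* FROM Records r
WHERE r.RecordID IN (1,2,3,10,11,12,13,16,17,18,26,27,28,557,31,32,33,36,37,93)
SET STATISTICS PROFILE OFF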
Upvotes: 0
Reputation: 294407
The 20th item happens to tip the balance of this particular query's cost estimation from one plan to the other. With 20 items you are probably getting a full table scan. IN is just syntactic sugar for OR ... OR ... OR ... OR, and OR is the enemy of good query plans. Use a join, as Andomar suggested.
Update
If you move away from the IN syntax, you can also use a query hint to make sure the query stays on the optimal plan. The IN syntax forces the query text to change with each execution, so you cannot pin it to a query plan hint.
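As a rough illustration only, using the @RecordIds table variable from Andomar's answer (the OPTION clause here is just an example of a query hint, not a recommendation; keep a hint only if it forces the plan that actually performs well):
select r.*
from Records r
inner join @RecordIds ri on ri.RecordID = r.RecordID
option (loop join) -- example query hint; the stable query text is what makes hinting practical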
Upvotes: 2
Reputation: 332691
A CLR table-valued function would be another way to create a table based on the provided parameters - for more info, see SQL Server 2005: CLR Table-Valued Functions.
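Roughly, the SQL side might look like this once the .NET assembly is built and deployed (the assembly, class, function, and path names below are all hypothetical):
CREATE ASSEMBLY IdSplitter FROM 'C:\assemblies\IdSplitter.dll'
GO
CREATE FUNCTION dbo.SplitIds(@list nvarchar(max))
RETURNS TABLE (RecordID int)
AS EXTERNAL NAME IdSplitter.UserDefinedFunctions.SplitIds
GO
-- join against the function instead of using IN
select r.*
from Records r
inner join dbo.SplitIds('1,2,3,10,11,12,13,16,17,18,26,27,28,557,31,32,33,36,37,93') s
    on s.RecordID = r.RecordID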
Upvotes: 0
Reputation: 20446
For MySQL, the manual says "The number of values in the IN list is only limited by the max_allowed_packet value." It does seem unlikely that this is the issue, but it's a place to look.
In any event, storing your IN() values in a temp table and joining your query to it should get around the whole problem.
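A minimal sketch of that, assuming MySQL and a table named Records:
CREATE TEMPORARY TABLE record_ids (RecordID INT PRIMARY KEY);
INSERT INTO record_ids (RecordID)
VALUES (1),(2),(3),(10),(11),(12),(13),(16),(17),(18),(26),(27),(28),(557),(31),(32),(33),(36),(37),(93);

SELECT r.*
FROM Records r
INNER JOIN record_ids ri ON ri.RecordID = r.RecordID;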
Upvotes: 0
Reputation: 238246
Place the RecordIDs in a temporary table, and use an inner join to filter on them. For SQL Server, this looks like:
declare @RecordIds table (RecordID int)
insert into @RecordIds values (1)
insert into @RecordIds values (2)
...
insert into @RecordIds values (93)
select r.*
from Records r
inner join @RecordIds ri on ri.RecordID = r.RecordID
Upvotes: 9
Reputation: 55468
One thing to do would be to look at the optimizer plan (if you can) and see how the plan differs when you use fewer than 20 items vs. 20 or more. In Oracle, for example, you can use the EXPLAIN PLAN command to see this output.
Here's some info on how to use explain plan in Oracle: http://download.oracle.com/docs/cd/B10501_01/server.920/a96533/ex_plan.htm
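For example (a minimal sketch, assuming Oracle and a table named Records), run this once for each version of the IN list and compare the output:
EXPLAIN PLAN FOR
SELECT r.* FROM Records r
WHERE r.RecordID IN (1,2,3,10,11,12,13,16,17,18,26,27,28,557,31,32,33,36,37,93);

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);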
Another thing to consider is whether you have an index on RecordID. It may be that once you cross a certain threshold (20 or more items in your case) the optimizer decides it's better to use a full table scan than your index.
With some databases you can use optimizer hints to persuade the optimizer to use an index if that indeed results in better performance.
Here's a link to optimizer hints you can read: http://download.oracle.com/docs/cd/B19306_01/server.102/b14211/hintsref.htm
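For instance (a sketch only - record_id_idx is a hypothetical index name, not one from your schema):
SELECT /*+ INDEX(r record_id_idx) */ r.*
FROM Records r
WHERE r.RecordID IN (1,2,3,10,11,12,13,16,17,18,26,27,28,557,31,32,33,36,37,93);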
My answer is Oracle-centric, but the same principles should apply to most any database.
Upvotes: 9
Reputation: 31650
It seems dirty and unnecessary, but have you tried:
(Record.RecordID IN (--19 items--) OR Record.RecordID = 20th_item) AND
I don't know why adding the 20th item to the IN group would push it over the edge.
Upvotes: 0