why using IN (or NOT IN) clause in a query makes it really slow

Question

I have a query:

SELECT DISTINCT field1 FROM table1 WHERE field2 = something

(table1 contains 1 million records, execution time:0.106sec, returns: 20 records)

Another query

SELECT DISTINCT similarField1 FROM table2 WHERE similarField2 = somethingElse

(table2 contains half million records, execution time:0.078sec, returns: 20 records)

Now if I run a query, by combining above both:

SELECT DISTINCT field1 FROM table1 WHERE field2 = something AND field1 NOT IN (SELECT DISTINCT similarField1 FROM table2 WHERE similarField2 = somethingElse)

It does't give result even running for 10mins. Why it has became dramatically slow, and what could be a potential solution.

edit: I am using MySQL with dbvisualizer 6.5

Tim Schmelter · Accepted Answer

You don't need to use DISTINCT on the sub-query. Try to use NOT EXISTS which probably is more efficient in SQL-Server:

SELECT DISTINCT field1 
FROM   table1 
WHERE  field2 = @something 
AND NOT EXISTS
(
    SELECT 1 FROM table2
    WHERE table2.similarfield1 = table1.field2 
     AND  table2.similarfield2 = @somethingelse
)

Edit: Since you have updated the tags, i'm not sure if this is more efficient in MySql. However, i'd prefer NOT EXISTS anyway since it also works with NULL values(if you use IS NULL) and is easier to read and to maintain.

why using IN (or NOT IN) clause in a query makes it really slow

Answers (2)

Related Questions