Anil Bhaskar
Anil Bhaskar

Reputation: 3978

why using IN (or NOT IN) clause in a query makes it really slow

I have a query:

SELECT DISTINCT field1 FROM table1 WHERE field2 = something

(table1 contains 1 million records, execution time:0.106sec, returns: 20 records)

Another query

SELECT DISTINCT similarField1 FROM table2 WHERE similarField2 = somethingElse

(table2 contains half million records, execution time:0.078sec, returns: 20 records)

Now if I run a query, by combining above both:

SELECT DISTINCT field1 FROM table1 WHERE field2 = something AND field1 NOT IN (SELECT DISTINCT similarField1 FROM table2 WHERE similarField2 = somethingElse)

It does't give result even running for 10mins. Why it has became dramatically slow, and what could be a potential solution.

edit: I am using MySQL with dbvisualizer 6.5

Upvotes: 0

Views: 200

Answers (2)

KumarHarsh
KumarHarsh

Reputation: 5094

my query and advice are similar to @TimSchmelter.

In fact you should not use distinct at all. First you should remove distinct and check if you are getting duplicate records you have just ask part of your problem.Table design are not clear.

You should post your complete problem and query here without any hesitant. Also don't forget to apply index on feild2, feild1,similarField1,similarField2.

SELECT DISTINCT field1
FROM table1 tbl1
WHERE field2 = something
    AND NOT EXISTS (
        SELECT similarField1
        FROM table2 tbl2
        WHERE tbl1.field1 = tbl2.similarField1
            AND similarField2 = somethingElse
        )

Upvotes: 0

Tim Schmelter
Tim Schmelter

Reputation: 460098

You don't need to use DISTINCT on the sub-query. Try to use NOT EXISTS which probably is more efficient in SQL-Server:

SELECT DISTINCT field1 
FROM   table1 
WHERE  field2 = @something 
AND NOT EXISTS
(
    SELECT 1 FROM table2
    WHERE table2.similarfield1 = table1.field2 
     AND  table2.similarfield2 = @somethingelse
)

Edit: Since you have updated the tags, i'm not sure if this is more efficient in MySql. However, i'd prefer NOT EXISTS anyway since it also works with NULL values(if you use IS NULL) and is easier to read and to maintain.

Upvotes: 1

Related Questions