Özenç B.
Özenç B.

Reputation: 1038

How can I convert a NOT IN statement to a NOT EXISTS statement in SQL?

I heard that NOT IN should be avoided at all costs, so I am updating my queries to utilize NOT EXISTS and run EXPLAIN ANALYSE on them to see if there are any improvements but I am having a hard time wrapping my head around it. How can I convert a query like this:

SELECT id
FROM customers c
WHERE c.status NOT IN
    ('DELETED', 'ARCHIVED', 'EXPIRED'')

Upvotes: 0

Views: 323

Answers (3)

Gordon Linoff
Gordon Linoff

Reputation: 1269763

You have heard the rule incompletely. This is false:

I heard that NOT IN should be avoided at all costs, . . .

This is much closer to being true:

I heard that NOT IN with a subquery should be avoided at all costs, . . .

There are two reasons for this. By far the more important has to do with the handling of NULL values. If any value returned by the subquery is NULL, then NOT IN never returns TRUE. That is, the query returns no rows (if this is the only condition).

On the other hand, NOT EXISTS does what you expect in this case, essentially ignoring NULL values in the subquery.

This is not an issue with explicit lists, because it is unlikely that you will include a NULL value in an explicit list.

The second issue is performance. Some databases will optimize NOT EXISTS with a subquery much better than NOT IN -- particularly if the appropriate indexes are available.

Upvotes: 1

Dai
Dai

Reputation: 155145

I heard that NOT IN should be avoided at all costs.

You have heard incorrectly - and you're applying something you don't understand (or something that you accept at face-value without any investigation on your part) in production. What you're professing is an example of Cargo-Cult Programming. Don't do this. Don't believe anything you read without citations and sources (this applies to everything in life, btw)

Anyway, NOT IN and NOT EXISTS are not semantically equivalent: IN (and NOT IN) when used with variadic literal values or variadic parameters is just syntactic-sugar for a repeating OR statement, whereas NOT EXISTS is a test for set-membership, which is a very different operation.

In your case, this:

WHERE c.status NOT IN ('DELETED', 'ARCHIVED', 'EXPIRED')

Is semantically identical to this:

WHERE NOT ( c.status = 'DELETED' OR c.status = 'ARCHIVED' OR c.status = 'EXPIRED' )

...such an expression cannot be simplified and I would argue it's optimal.

Ergo: no changes to this query are needed.

Upvotes: 0

Andronicus
Andronicus

Reputation: 26046

That might be true if you're using not in with nested select, for example:

select id
from customer c
where c.status_id not in (
    select id
    from status s
    where s.type = 'DELETED'
)

This should rather be rewritten to:

select id
from customer c
where not exists(
    select *
    from status s
    where (s.id, s.type) = (c.status_id, 'DELETED')
)

But since you're using not in with a set of possible values, it's ok and it would be even hard to rewrite it using exists.

Upvotes: 1

Related Questions