ledneb
ledneb

Reputation: 1431

WHERE x IN (5) vs WHERE x = 5 ... why use IN?

Something I've noticed recently is people using IN where it seems kind of inappropriate. I'm feeling like I'm missing a trick - perhaps there are speed gains or something?

This query:

SELECT * FROM pages WHERE is_visible IN ($visibility)

Why use that? Why not:

SELECT * FROM pages WHERE is_visible = $visibility

Sure, if it were this:

SELECT * FROM pages WHERE is_visible 
IN ($visibility,$invisibility,$somethingelse)

Then WHERE x IN (1,2,3) is a better alternative to WHERE x = 1 OR x = 2 OR x = 3, right? But surely this is the only scenario where you should use IN?

Anyway thanks for your time - I look forward to your replies!

Upvotes: 6

Views: 2437

Answers (9)

spencer7593
spencer7593

Reputation: 108480

No, it's not a trick. The two statements:

SELECT * FROM pages WHERE is_visible IN ($visibility)
SELECT * FROM pages WHERE is_visible = $visibility

are nearly equivalent. We observe that the two statements are equivalent in the trivial case, for example, when $visibility is a scalar with a value of 1.

But the statements are not equivalent in the non-trivial cases when $visibility contains something else. We can observe a significant difference in behavior of the two forms. Consider what happens with each form when $visibility is a string containing these example values:

    '1,2,3' 
    '1 OR 1=1'
    'select v.val from vals v'

We observe a significant difference in the resultant SQL statements generated from the two forms:

    SELECT * FROM pages WHERE is_visible IN (1,2,3)
    SELECT * FROM pages WHERE is_visible = 1,2,3
    SELECT * FROM pages WHERE is_visible IN (1 OR 1=1 )
    SELECT * FROM pages WHERE is_visible = 1 OR 1=1

A large concern here, with either form of the statement, is the potential for SQL injection. If $visibility is intended to be a scalar value, then using a bind variable in the statement is a more secure approach, since it avoids anyone from sliding 'extra' SQL syntax into the statement. (Of course, using bind variables doesn't prevent all SQL injection, but it is suitable approach to closing one hole. Using a bind variable will also improve scalability, at least on some DBMS platforms such as Oracle.)

Consider what happens when we use a bind variable (placeholder), which we know will NOT be interpreted as SQL syntax. We observe that the two statements ARE indeed equivalent:

    SELECT * FROM pages WHERE is_visible IN ( ? )
    SELECT * FROM pages WHERE is_visible = ? 

for any value supplied for the bind variable.

HTH

Upvotes: 1

ErikE
ErikE

Reputation: 50271

Then WHERE x IN (1,2,3) is a better alternative to WHERE x = 1 OR x = 2 OR x = 3, right? But surely this is the only scenario where you should use IN?

IN () and OR are syntactically equivalent. Examining the execution plan for your two suggestions will show this. IN() simply is a more efficient and easy-to-understand notation.

Upvotes: 0

Kip
Kip

Reputation: 109433

There is no trick here to boost performance, but I don't think there is any performance hit either. So both ways are OK. I agree with you that using an IN clause with only one parameter looks funny, but it is still easy enough to follow.

I think this probably happens because a developer thinks it is likely that multiple values are allowed in the future (this way the query wouldn't be changed). Or possibly the developer is used to scenarios where multiple values are possible, and they just wrote it that way out of habit.

Upvotes: 2

Otávio Décio
Otávio Décio

Reputation: 74290

Then WHERE x IN (1,2,3) is a better alternative to WHERE x = 1 OR x = 2 OR x = 3, right? But surely this is the only scenario where you should use IN?

No, you can use IN with a subquery as well.

... where field in (select field from another_table)

Upvotes: 8

Lee
Lee

Reputation: 20984

I think more to the point in understand what IN does.

If you do

SELECT * FROM pages WHERE is_visible = $visibility

In the above $visibility would have to be a var so is_visibile = 1, So your SQL collects all where is_visible = 1

Where

SELECT * FROM pages WHERE is_visible IN ($visibility)

The $visibility would be an array of data like @Ionut G. Stan has illustrated.

So your call would look like is_visible IN ('1', '2', '3')

So now your SQL will be collecting all 1,2 & 3 rows.

Hope that makes sense. Its a big difference.

Upvotes: 3

Yishai
Yishai

Reputation: 91921

In acts on any kind of set operation, whereas = is on a single value. So you can use in for multiple records on another table, or any other kind of data structure that represents multiple values.

Upvotes: 2

gbn
gbn

Reputation: 432521

"WHERE x IN (1,2,3)" is the same as "WHERE x = 1 OR x = 2 OR x = 3" anyway.

Upvotes: 0

Thomas Lötzer
Thomas Lötzer

Reputation: 25401

Maybe it is unknown wether $visibility refers to a single value or multiple values? Doesn't quite fit your example, but I have seen such use in other places.

Upvotes: 9

Ionuț G. Stan
Ionuț G. Stan

Reputation: 179179

Maybe $visibility is dynamically generated, like this:

$visibility = implode(', ', array_map('intval', array(1, 2, 3)));

Where array(1, 2, 3) may come from untrusted sources.

Upvotes: 5

Related Questions