Reputation: 31
I am wondering why the below SQL statement behaves the way that it does:
select *
from tableA
where document_id in (select document_id
from tableB
where batch_id = 99997)
tableA
contains the column document_id
but tableB
does not, so the query appears to return all rows in tableA
, this behavior occurs if you use any field name in the select statement of the IN clause that is a field in tableA
. Using a name not in tableA
or tableB
does result in an error.
Upvotes: 3
Views: 260
Reputation: 59
I've just found this behaviour myself after many years. I use Exists() instead of the IN() clause except for ad-hoc and non data modifying queries because of unexpected results in the past with null values, but this had me flummoxed until I read DavidG's response. There is no reason to write a query in such a manner, referencing an outer column without referencing the subqueries table at all but a syntax checker would have trouble disallowing it.
Mostly if using IN() with a sub query you would be referencing a foreign key which, 95+% of the time (with our company standards), has the same name as the primary key, but those 5% of times could catch you out.
Although it seems this behavior also occurs with Exists()...
Select * from tableA Where Exists(Select document_id from tableB)
...Exists() is never written this way because it makes no sense but is written more explicitly something like...
Select * From tableA Where Exists(Select * from tableB tB where tB.TableB_Document_id = tableA.document_id)
and so less likely to occur.
(Note: I would've put this in a comment but I don't have the points)
Upvotes: 0
Reputation: 6148
This is why I'd suggest you get in the habit of putting the explicit table name for each column. It is also helpful for maintainability or later extending queries.
select *
from tableA A
where A.document_id in (select B.document_id
from tableB B
where B.batch_id = 99997)
If you qualify the tables like this, it will throw a clear error on runtime and prevent any subtle mistakes. This would give the error similar to
tableB.document_id does not exist.
Upvotes: 3
Reputation: 118937
It's not an error. In a subquery, you can still use columns form the parent. So when you say
SELECT document_id FROM tableB WHERE batch_id = 99997
You are saying for every row where batch_id
is 9997 in tableB, select document_id
from tableA
. And of course, all those document_id
values exists, so it will return all those rows.
Upvotes: 5