Reputation: 385
I have a question about correlated subqueries that I don't understand. I know that a correlated subquery runs for each row of the outer table. Example:
Select departmentID, productid, unit_price
From products a
Where unit_price=(
Select MIN(unit_price)
from products b
Where a.departmentID= b.departmentID
)
Order by a.departmentID
If it runs for each row in the outer query, then if the outer query has 3 rows where the departmentID = 1, does the subquery run 3 times for departmentID =1, even though it has calculated the MIN(unit_price) once for departmentID=1?.
IMO it would be stupid to run the subquery more than once for departmentID =1
Anyone care to elaborate?
EDIT: Second example: Outer: Select departmentID, productid, unit_price From products a Where unit_price=10 and departmentID = 1 Inner: SELECT MIN(unitprice) FROM Production.Products AS P2 WHERE P2.categoryid = 1
Is above how the correlated subquery works?
Upvotes: 0
Views: 1829
Reputation: 1592
Your Query should be -
Select departmentID, productid, unit_price
From products a
Where unit_price = ( -- NOTE - 'unit_price = ' (the condition)
Select MIN(unit_price)
from products b
Where a.departmentID= b.departmentID
)
Order by a.departmentID
Here if the table products
contains more than one rows for departmentID = 1
then sub-query is run for each of the time when departmentID = 1
. This is procedural way of thinking and is wrong in case of SQL because SQL is a declarative language. You should not think iteratively for each records.
However, note that the previous sub queries results are not stored anywhere.
As others suggested you should not think of the it this way, as Query Planner takes care of how efficiently the results could be fetched based on your query.
Hope this helps.
Upvotes: 0
Reputation: 146603
You misunderstand. The statement that "...correlated subquery runs for each row of the outer table." is not to be taken literally. The query processor, in any database product intelligently reads and processes the data in an optimized "query plan" that is at one or more levels of abstraction lower (closer to the data) than the set-based representation you deal with in any SQL-language. Google "Hash-Joins", "Merge Joins", or "query Optimization" to learn more.
The statement is only making the point that the results of the sub-query will be different based on the data in the row of the outer resultset, because they are dependent on that data, whereas an un-correlated sub-query, because it is not dependent on the row data in the outer resultset, will be the same for every outer row.
Upvotes: 0
Reputation: 1271151
First, if you have a question about how databases do things, tag with the database you are using.
Second, your query is non-sensical and will be rejected by most databases. You need either a comparison (such as =
or in
) or exists
for the subquery.
Third, this statement is incorrect:
I know that a correlated subquery runs for each row of the outer table.
What you should know is that SQL is a descriptive language, not a procedural language. A SQL query describes the output that you want. It does not mandate a particular type of processing. The SQL optimizer determines the best approach for running the query.
That said, some optimizers are smarter than others. And in some databases, a correlated subquery will always result in its being run once per row in the outer query. (And even this behavior is significantly mitigated by proper indexing.) This behavior is not a requirement of the language; it is a limitation of those databases.
Upvotes: 4