Reputation: 292
Please can anyone give me the difference between using a Correlated sub-query over a JOIN ? Are they not meant to solve the same problem. Is there a performance Hit over one. I am using SQL server and I tend to use FOR XML PATH('') to select multiple columns that occur in subqueries. Any example to explain the difference will be appreciated.
Upvotes: 0
Views: 3066
Reputation: 18408
"Please can anyone give me the difference between using a Correlated sub-query over a JOIN ? Are they not meant to solve the same problem ?"
No they aren't.
With correlated subqueries, you get an expression for an invocation of either the SEMIJOIN or else the SEMIMINUS of the relational algebra. (Note that I'm talking here of the most typical use of correlated subqueries.)
SELECT * FROM A1
WHERE attr in (SELECT attr FROM A2);
gives you only the columns that appear in A1. Unlike what you would get if you wrote the JOIN
SELECT * FROM A1 NATURAL JOIN A2
which would also give you any additional columns from A2 that do not appear in A1.
And
SELECT * FROM A1
WHERE attr NOT IN (...)
also involves a correlated subquery, but this time it is a semiminus invocation which cannot even be written as a JOIN.
And finally, correlated subqueries can also appear in the SELECT list.
SELECT a,b, ( SELECT MIN(b) FROM T AS sub WHERE sub.a=main.a AND sub.b>main.b )
FROM T AS main
In this case too, many queries of this nature cannot be written as a JOIN.
Upvotes: 1
Reputation: 107826
As usual, it's horses for courses.
However, there is a loose rule that if you can do it with JOINs, then it works better because you get two streams of data being matched together using SQL Server's set-based operations.
Correlated subqueries however are the only way to solve certain problems, but they result in a loop-like behaviour, one for each row - necessarily slower and performs worse.
Upvotes: 2