damola
damola

Reputation: 292

Correlated Subquery and Joins

Please can anyone give me the difference between using a Correlated sub-query over a JOIN ? Are they not meant to solve the same problem. Is there a performance Hit over one. I am using SQL server and I tend to use FOR XML PATH('') to select multiple columns that occur in subqueries. Any example to explain the difference will be appreciated.

Upvotes: 0

Views: 3066

Answers (2)

Erwin Smout
Erwin Smout

Reputation: 18408

"Please can anyone give me the difference between using a Correlated sub-query over a JOIN ? Are they not meant to solve the same problem ?"

No they aren't.

With correlated subqueries, you get an expression for an invocation of either the SEMIJOIN or else the SEMIMINUS of the relational algebra. (Note that I'm talking here of the most typical use of correlated subqueries.)

SELECT * FROM A1
 WHERE attr in (SELECT attr FROM A2);

gives you only the columns that appear in A1. Unlike what you would get if you wrote the JOIN

SELECT * FROM A1 NATURAL JOIN A2

which would also give you any additional columns from A2 that do not appear in A1.

And

SELECT * FROM A1
 WHERE attr NOT IN (...)

also involves a correlated subquery, but this time it is a semiminus invocation which cannot even be written as a JOIN.

And finally, correlated subqueries can also appear in the SELECT list.

SELECT a,b, ( SELECT MIN(b) FROM T AS sub WHERE sub.a=main.a AND sub.b>main.b )
  FROM T AS main

In this case too, many queries of this nature cannot be written as a JOIN.

Upvotes: 1

RichardTheKiwi
RichardTheKiwi

Reputation: 107826

As usual, it's horses for courses.

However, there is a loose rule that if you can do it with JOINs, then it works better because you get two streams of data being matched together using SQL Server's set-based operations.

Correlated subqueries however are the only way to solve certain problems, but they result in a loop-like behaviour, one for each row - necessarily slower and performs worse.

Upvotes: 2

Related Questions