Reputation: 10571
I'm learning SQL and happen to hang on a point that confuses me:
I got 1 table with 1 column (boys.boy) while boy is a varchar.
SELECT boys.boy, boys.boy
FROM boys
LEFT JOIN boys
ON length(boy) > length(boy)
Why am I unable to do this? Would this break some invariant, or why did the inventors explicitly bring in "self-joins"?
In other words, this works ("self join"):
SELECT b1.boy, b2.boy
FROM boys AS b1
INNER JOIN boys AS b2
ON length(b1.boy) > length(b2.boy)
The magic lies in the aliases (AS ...).
Upvotes: 1
Views: 3198
Reputation: 26856
You have to use aliases in this case.
Otherwise server can't distinguish one boys.boy
from another, and this construction
length(boy) > length(boy)
is pretty ambiguous - since it can be interpreted as
length(b1.boy) > length(b2.boy)
or
length(b2.boy) > length(b1.boy)
or even
length(b1.boy) > length(b1.boy)
Update
Consider this simple snippet (t-sql used here):
declare @boys table (boy nvarchar(128))
insert into @boys
select 'Ed'
union
select 'Tom'
union
select 'Nick'
select b1.boy, b2.boy
from @boys as b1
left outer join @boys as b2 on len(b1.boy) > len(b2.boy)
select b1.boy, b2.boy
from @boys as b1
left outer join @boys as b2 on len(b2.boy) > len(b1.boy)
Output from first query will be
Ed NULL
Nick Ed
Nick Tom
Tom Ed
And from the second one:
Ed Nick
Ed Tom
Nick NULL
Tom Nick
Explanation:
Let's have a look at first query. It is basically: "take all records from table and combine each record with all other records having length of boy's name less than that record has". That's why whe getting Ed, Null
pair - there are no records having length of name less than Ed
has.
But in the second query condition is "take all records from table and combine each record with all other records having length of boy's name greater than that record has". That's why we're getting pairs Ed, Nick
and Ed, Tom
in this case.
Upvotes: 8
Reputation: 239684
ON
clauses can contain arbitrary predicates. Although it's common to see a join like this:
FROM Table1
INNER JOIN
Table2
ON
Table1ColumnA = Table2ColumnC and
Table1ColumnB = Table2ColumnD
it is by no means required that the columns on the left are references from the first table and columns on the right are references to the second table.
For instance, when trying to find intervals that overlap, I would commonly construct a join as:
FROM
Table1 t1
INNER JOIN
Table1 t2
ON
t1.Start < t2.End AND
t2.Start < t1.End AND
t1.ID != t2.ID
Note that the tables being used in each comparison are the opposite way around.
This also means that you can do more complex joins:
FROM
Table1 t1
LEFT JOIN
Table2 t2
ON
t2.ID = t1.ID
LEFT JOIN
Table3 t3
ON
t1.ColB = t3.ColF
INNER JOIN
Table4 t4
ON
t4.ColA = t2.ColC OR
t3.ColZ = t4.ColY
Which is effectively performing an INNER JOIN
between tables 1 and 4 using either Table 2 or Table 3 (or both).
As I said at the top - the ON
clauses can contain arbitrary predicates.
Upvotes: 0