Reputation: 628
I had an assignment question:
Find the celebs that have been in relationship with the same celeb.
The result should be (celeb1, celeb2, celeb3) triples,
meaning that celeb1 and celeb2 have been in relationship with celeb3.
Now the table "Relationships" have celeb1 and celeb2 fields..where the value is a VARCHAR.
My solution to the problem was:
CREATE VIEW Celeb1Rels AS
SELECT celeb1 AS c1, celeb2 AS c2 FROM relationships;
CREATE VIEW Celeb2Rels AS
SELECT celeb1 AS c2, celeb2 AS c3 FROM relationships;
SELECT * FROM Celeb1Rels NATURAL JOIN Celeb2Rels;
and it works fine. However the instructor posted his solution and he had:
SELECT X.celeb1, Y.celeb1, X.celeb2
FROM Relationships X, Relationships Y
WHERE X.celeb2=Y.celeb2 AND X.celeb1<Y.celeb1;
I don't understand why he is using X.celeb1 < Y.celeb1 It does work and give the right output, but I thought the "<" was used for comparing numbers?
Can anyone explain what the "<" is doing in this instance? and how it behaves when comparing VARCHARS?
Upvotes: 3
Views: 168
Reputation: 146239
The instructor's solution produces triples, where CELEB2 is in two relationships. This part of the WHERE clause ...
X.celeb1<Y.celeb1
... ensures that you only get rows for three different celebrities (i.e. it avoids matching the same record against itself) and you only get one row for each threesome.
Less than works exactly as you think it would, it sorts alphabetically. So 'ANDY GARCIA' < 'ANDY KAUFMAN'
is true.
Things to watch out for:
'andy garcia' < 'ANDY KAUFMAN'
is false.'11' < '2'
is true. "I had no idea about this sorting issue"
It is possible to make Oracle sort case insenstively, by setting the NLS_SORT parameter. However this won't change comparisons; for that we need to change the NLS_COMP parameter to LINGUISTIC
. These are not default behaviour simply because there's too much code out there which might rely on case-senstive sorting. Find out more.
Upvotes: 1
Reputation: 69769
It is so that you don't get duplicate relationships, Take for instance 2 hypothetical relationships between celeb1 and celeb2, and celeb1 and celeb3, you want the result
celeb2, celeb3, celeb1
You need the second inequality operator to ensure that the table doesn't join on the same relationship (i.e. celeb1 & Celeb2 joining back to celeb1 & celeb2). If you were to use your instructors query and amend it so instead of <
you were to use not equals <>
though, you would get the result:
celeb2, celeb3, celeb1
celeb3, celeb2, celeb1
But these rows show the same thing but in a different order, the >
inequality operator just ensures that the 2nd column is always a name alphabetically after the 1st column.
So to summarise the >
operator when applied to varchars works alphabetically, so 'a' < 'b'
, 'abc' > 'aaa'
etc.
Upvotes: 3
Reputation: 24144
I guess to exclude doubled records for example:
If (A,B)
and (B,C)
in this table then if query is without AND X.celeb1<Y.celeb1
we get
(A,B,C)
and (B,C,A)
in the output. Adding this condition we output only one record (A,B,C)
as soon as A < C
.
Upvotes: 2