Reputation: 3615
I have two tables with given fields
HighSchooler
Likes
HighSchooler holds info on a high school student and Likes is a relationship table showing who likes who in the high school. Student with id1 likes student with id2. There are both one way relationships (when John likes Lucy but Lucy doesn't like John) and two way relationships (when Albert likes Sandra and Sandra also likes Albert).
I need a query that returns two columns with the names of those in a two way relationship, i.e if A likes B and B likes A, then a sample result-set would be
name | name
A B
I fiddle with it and came up with this query, but i don't understand it and don't think it is optimal.
SELECT DISTINCT a.name, b.name
FROM Highschooler a, Highschooler b, Likes l1
JOIN Likes l2 on l1.ID1=l2.ID2
WHERE a.ID=l1.ID2 AND b.ID=l1.ID1 AND a.ID=l2.ID1 AND a.ID > b.ID;
Upvotes: 1
Views: 1029
Reputation: 753525
The key self-join is between two references to the Likes table. This then needs to be joined twice to the HighSchoolers table to get the names of the two people.
SELECT l1.id1, l1.id2
FROM Likes AS l1
JOIN Likes AS l2
ON l1.id1 = l2.id2 AND l1.id2 = l2.id1;
This gives the list of pairs of IDs where each likes the other.
The only snag is that it gives each pair twice. So, the trick is to note that in one of the two rows, the id1
value is smaller than the id2
value. As a possibly beneficial side-effect, this eliminates anyone who likes themself.
SELECT l1.id1, l1.id2
FROM Likes AS l1
JOIN Likes AS l2
ON l1.id1 = l2.id2 AND l1.id2 = l2.id1;
WHERE l1.id1 < l1.id2
Now to tidy it up with names:
SELECT h1.name AS name1, h2.name AS name2
FROM (SELECT l1.id1, l1.id2
FROM Likes AS l1
JOIN Likes AS l2
ON l1.id1 = l2.id2 AND l1.id2 = l2.id1
WHERE l1.id1 < l1.id2
) AS p
JOIN HighSchoolers AS h1 ON p.id1 = h1.id
JOIN HighSchoolers AS h2 ON p.id2 = h2.id
p
is mnemonic for 'pairs'.
Upvotes: 1
Reputation: 6086
Your query is correct but it is using Cartesian product of tables which as you said is not optimal. When you write select * from a,b
all the rows of a and all the rows of b are combined together to form a new table which has size(a)*size(b) rows. You are doing that with three tables so you are creating a huge table and then select from it a few rows that you want. Inner join can do this more efficiently.
SELECT
a.name AS name_a, b.name AS name_b
FROM
HighSchooler AS a
INNER JOIN Likes AS l1
ON a.id = l1.id1
INNER JOIN Likes AS l2
ON l1.id1 = l2.id2 AND l1.id2 = l2.id1 AND l1.id1 < l1.id2
INNER JOIN HighSchooler AS b
ON l1.id2 = b.id;
please see fiddle:
Upvotes: 1
Reputation: 1875
Try joining Likes table with itself using the rule (l1.id1 = l2.id2) and (l1.id2 = l2.id1)
Example:
SELECT
a.name AS a_name,
b.name AS b_name
FROM
HighSchooler AS a
INNER JOIN Likes AS l1
ON (a.id = l1.id1)
INNER JOIN Likes AS l2
ON ((l1.id1 = l2.id2) AND (l1.id2 = l2.id1) AND (l1.id1 > l2.id1))
INNER JOIN HighSchooler AS b
ON (l2.id1 = b.id)
http://sqlfiddle.com/#!2/44a07/6
Upvotes: 3