Haofei Ying
Haofei Ying

Reputation: 11

Find pair of sets in one MySQL table

Suppose there is a table T describing people owning books. Fields are pid (person id), bid (book id):

-----------
|pid | bid|
-----------
| 1  |  1 | 
| 1  |  2 |
| 2  |  3 |
| 3  |  2 |
| 3  |  3 |
-----------

Now, how can I find all pairs of persons who do not own any books in common? In this example, it should return (1, 2).

Also, how to ensure it is not duplicated? For example, (1, 2) and (2, 1) are duplicates.

Upvotes: 1

Views: 124

Answers (1)

Chui Tey
Chui Tey

Reputation: 5564

Let's create a table to play with

create table #temp1 (pid int, bid int)

insert into #temp1 (pid, bid) values (1,1)
insert into #temp1 (pid, bid) values (1,2)
insert into #temp1 (pid, bid) values (2,3)
insert into #temp1 (pid, bid) values (3,2)
insert into #temp1 (pid, bid) values (3,3)

Step 1. This query finds all the people who share books in common

select distinct t1.pid, t2.pid
FROM #temp1 t1 
inner join #temp1 t2 
on t1.bid = t2.bid AND t1.pid <> t2.pid

pid         pid
----------- -----------
1           3
2           3
3           1
3           2

Step 2. But the problem is the values are duplicated. We can exclude them by making sure one pid is less than the other.

select distinct t1.pid, t2.pid
FROM #temp1 t1 
inner join #temp1 t2 
on t1.bid = t2.bid AND t1.pid <> t2.pid
where t1.pid < t2.pid

pid         pid
----------- -----------
1           3
2           3

Now it's a matter of finding all pairs of people and excluding them from the list which has books in common.

Step 3. We can get all pairs of people

SELECT distinct t1.pid, t2.pid
FROM #temp1 t1, #temp1 t2
where t1.pid < t2.pid

pid         pid
----------- -----------
1           2
1           3
2           3

Step 4. We need to use results from Step 3 and exclude results from Step 2

SELECT STEP4.* FROM
(SELECT distinct t1.pid AS PID1, t2.pid AS PID2
 FROM #temp1 t1, #temp1 t2
 where t1.pid < t2.pid) STEP4
WHERE NOT EXISTS (SELECT * FROM
    (select distinct t1.pid AS PID1, t2.pid AS PID2
     FROM #temp1 t1 
     inner join #temp1 t2 
     on t1.bid = t2.bid AND t1.pid <> t2.pid
     where t1.pid < t2.pid) STEP2
  where STEP2.PID1 = STEP4.PID1 AND STEP2.PID2 = STEP4.PID2)

PID1        PID2
----------- -----------
1           2

Upvotes: 2

Related Questions