Reputation: 4496
I have a database that contains two tables:
Users (user_id is the primary key) and Friends.
The Friends table has two columns, friend1 and friend2, both of which contain user_ids (foreign keys referencing Users). In each friend pair, friend1's user_id is less than friend2's user_id.
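A minimal sketch of the schema described above, using SQLite through Python's built-in sqlite3 module purely for illustration (the original database is not SQLite). The column types, the composite primary key, and the CHECK constraint are assumptions that encode the "friend1 < friend2" rule; SQLite also does not enforce the foreign keys unless PRAGMA foreign_keys is switched on.

```python
import sqlite3

def create_schema(conn):
    """Create the two tables described in the question (types are assumed)."""
    conn.executescript("""
        CREATE TABLE users (
            user_id INTEGER PRIMARY KEY
        );
        CREATE TABLE friends (
            friend1 INTEGER NOT NULL REFERENCES users(user_id),
            friend2 INTEGER NOT NULL REFERENCES users(user_id),
            PRIMARY KEY (friend1, friend2),
            CHECK (friend1 < friend2)  -- each pair is stored once, smaller id first
        );
    """)

conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.executemany("INSERT INTO users VALUES (?)", [(1,), (2,), (3,)])
conn.execute("INSERT INTO friends VALUES (1, 2)")  # accepted: 1 < 2

try:
    conn.execute("INSERT INTO friends VALUES (3, 1)")  # violates the CHECK
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```

Because every pair is stored only once, any query that needs "all friends of a user" has to look at both columns, which is why the queries below use a UNION.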
I am trying to find the pairs of users who are not friends but who share the greatest number of friends.
I have managed to do this in two separate queries:
Number of shared friends for users u1 and u2:
select count(*)
from
  ((select friend1 from friends where friend2 = u1 UNION
    select friend2 from friends where friend1 = u1)
   INTERSECT
   (select friend1 from friends where friend2 = u2 UNION
    select friend2 from friends where friend1 = u2))
;
Set of all user_id -> user_id pairs who are not friends:
select distinct
  u1.user_id as friend1,
  u2.user_id as friend2
from
  users u1,
  users u2
where
  u1.user_id < u2.user_id
minus
select friend1, friend2
from friends order by friend1;
However, my ultimate goal is to get a result of the form
user1 user2 shared_friends
such that user1 < user2, user1 and user2 are not friends, and shared_friends is the number of friends the two users have in common. So far I have been unable to achieve this.
Upvotes: 1
Views: 387
Reputation: 64674
The first two CTEs (Users and Friends) are just there to provide some sample data.
With Users As
(
    Select 1 As UserId, 'Alice' As Name
    Union All Select 2, 'Bob'
    Union All Select 3, 'Caroline'
    Union All Select 4, 'Doug'
)
, Friends As
(
    Select 1 As Friend1, 2 As Friend2
    Union All Select 2, 1
    Union All Select 2, 3
    Union All Select 2, 4
    Union All Select 3, 1
    Union All Select 3, 4
)
, UserFriends As
(
    Select U1.UserId
        , Case
            When F1.Friend1 = U1.UserId Then F1.Friend2
            Else F1.Friend1
            End As Friend
    From Users As U1
        Join Friends As F1
            On U1.UserId In(F1.Friend1,F1.Friend2)
    Group By U1.UserId
        , Case
            When F1.Friend1 = U1.UserId Then F1.Friend2
            Else F1.Friend1
            End
)
Select U1.Name, U2.Name
    , Count(*) As MutualFriendCount
    , Group_Concat(F.Name) As SharedFriends
From UserFriends As UF1
    Join UserFriends As UF2
        On UF2.Friend = UF1.Friend
    Join Users As U1
        On U1.UserId = UF1.UserId
    Join Users As U2
        On U2.UserId = UF2.UserId
    Join Users As F
        On F.UserId = UF1.Friend
            And F.UserId = UF2.Friend
Where UF1.UserId < UF2.UserId
    And Not Exists (
        Select 1
        From UserFriends As F3
        Where F3.UserId = UF1.UserId
            And F3.Friend = UF2.UserId
        )
Group By U1.Name, U2.Name
Order By U1.Name, U2.Name
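The same idea can be sketched against the table and column names from the question (users.user_id, friends.friend1, friends.friend2). This is an illustrative sketch, not a tested Oracle query: it runs on SQLite through Python's sqlite3 module, the user_friends CTE name and the non_friend_pairs helper are made up for the example, and the sample rows are the friendships above rewritten so that friend1 < friend2.

```python
import sqlite3

# user_friends lists every friendship in both directions, so "friends of X"
# is a single lookup on user_id. The self-join pairs up users who share a
# friend, and NOT EXISTS drops pairs that are already friends.
QUERY = """
WITH user_friends AS (
    SELECT friend1 AS user_id, friend2 AS friend FROM friends
    UNION
    SELECT friend2 AS user_id, friend1 AS friend FROM friends
)
SELECT uf1.user_id AS user1,
       uf2.user_id AS user2,
       COUNT(*)    AS shared_friends
FROM user_friends AS uf1
JOIN user_friends AS uf2
  ON uf2.friend = uf1.friend
 AND uf1.user_id < uf2.user_id
WHERE NOT EXISTS (
    SELECT 1
    FROM friends AS f
    WHERE f.friend1 = uf1.user_id   -- relies on friend1 < friend2 in storage
      AND f.friend2 = uf2.user_id
)
GROUP BY uf1.user_id, uf2.user_id
ORDER BY shared_friends DESC, user1, user2
"""

def non_friend_pairs(conn):
    """Return (user1, user2, shared_friends) rows, most shared friends first."""
    return conn.execute(QUERY).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE friends (friend1 INTEGER, friend2 INTEGER)")
conn.executemany("INSERT INTO users VALUES (?)", [(1,), (2,), (3,), (4,)])
conn.executemany(
    "INSERT INTO friends VALUES (?, ?)",
    [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)],
)

rows = non_friend_pairs(conn)
print(rows)  # [(1, 4, 2)]
```

Users 1 and 4 are the only pair that is not already friends, and they share friends 2 and 3. Pairs with no shared friends do not appear at all, because the inner join needs at least one common friend.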
Upvotes: 0
Reputation: 15105
This should get you started. It is written for Microsoft SQL Server, although the syntax is fairly generic:
select a1.friend1 as User1,
       a2.friend1 as user2,
       count( distinct a1.friend2) as Shared_Friends
from friends a1
join (select distinct friend1,friend2 from friends a2) a2
  on a1.friend2=a2.friend2
left join friends a3 on a3.friend1=a1.friend1 and a3.friend2 = a2.friend1
where (a1.friend1=1 and a2.friend1=8) and a3.friend1 is null
group by a1.friend1,a2.friend1
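One possible way to take this further is to drop the hard-coded ids and look at the friendships in both directions, keeping the left join / "is null" test to exclude existing friends. The sketch below is only an illustration under assumptions: it runs on SQLite via Python's sqlite3 module rather than SQL Server, the EDGES subquery and the sample data (five users, six friendships) are invented for the example, and the friends table is assumed to store each pair once with friend1 < friend2.

```python
import sqlite3

# Every friendship listed in both directions (hypothetical helper subquery).
EDGES = """
    SELECT friend1 AS user_id, friend2 AS friend FROM friends
    UNION ALL
    SELECT friend2 AS user_id, friend1 AS friend FROM friends
"""

QUERY = f"""
SELECT a1.user_id AS user1,
       a2.user_id AS user2,
       COUNT(DISTINCT a1.friend) AS shared_friends
FROM ({EDGES}) AS a1
JOIN ({EDGES}) AS a2
  ON a1.friend = a2.friend
 AND a1.user_id < a2.user_id
LEFT JOIN friends AS a3            -- anti-join: matches only existing friends
  ON a3.friend1 = a1.user_id
 AND a3.friend2 = a2.user_id
WHERE a3.friend1 IS NULL           -- keep pairs with no friendship row
GROUP BY a1.user_id, a2.user_id
ORDER BY shared_friends DESC, user1, user2
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE friends (friend1 INTEGER, friend2 INTEGER)")
conn.executemany(
    "INSERT INTO friends VALUES (?, ?)",
    [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (1, 5)],
)

rows = conn.execute(QUERY).fetchall()
print(rows)  # [(1, 4, 2), (2, 5, 1), (3, 5, 1)]
```

The first row is the non-friend pair with the most friends in common. Users 4 and 5 have no common friend, so that pair is absent from the result.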
Upvotes: 0