Reputation: 57
Basically what i am trying to do is to suggest people based on common interests.
I have a table of Users.
I have a table of Interested_People where UserID + InterestID is stored.
I have a table of Contactlist where people who are added with each other is stored.
What I want is to only output people who are not your friends.
I searched a lot in internet but couldn't find something like so.
Although I created a query but it is very slow. Now I Kindly request you guys if you can edit my query a bit and make it much more bandwidth & time efficient.
SELECT *
FROM users
WHERE id IN(SELECT userid
FROM interested_people
WHERE interested_in IN(SELECT interested_in
FROM interested_people
WHERE userid = [userid])
AND id NOT IN(SELECT user1 AS my_friends_userid
FROM contactlist f
WHERE f.user2 = [userid]
AND accepted = 1
UNION
SELECT user2 AS my_friends_userid
FROM contactlist f
WHERE f.user1 = [userid]
AND accepted = 1))
AND id != [userid]
ORDER BY Rand ()
LIMIT 0, 10;
This query actually does the job but it takes very long about 16 sec in my local machine. and that's not what I want. I want a fast and reliable one.
Thanks in advance!
Upvotes: 2
Views: 409
Reputation: 1140
I think this will give you the same results but perform a lot better:
SELECT * FROM Users u
INNER JOIN interested_people i
ON u.id = i.userid
WHERE NOT EXISTS
(SELECT * FROM contacts WHERE user1 = [userid] or user2 = [userid] and accepted=1)
AND id != [userid]
ORDER BY Rand()
LIMIT 0, 10
Skip the ORDER BY clause if that is at all reasonable. That will be the most expensive part
The select and join clauses give you the users who are interested in connecting and the WHERE NOT EXISTS is a performant way to exclude those contacts already listed.
Upvotes: 0
Reputation: 48139
First, looking at your interested-in query and assuming the "userID" you are testing with is = 1. Sounds like you are trying to get one level away from those user 1 is also interested in...
SELECT userid FROM interested_people
WHERE interested_in IN
( SELECT interested_in FROM interested_people
WHERE userid = [userid] )
Sample Data for Interested_People
userID Interested_In
1 5
1 7
1 8
2 3
2 5
2 7
7 1
7 2
7 5
8 3
In this case, the innermost returns interested_in values of 5, 7, 8. Then, getting all users who are interested in 5, 7 and 8 would return 2 and 7. (but since both users 2 and 7 are interested in 5, the 2 ID would be returned TWICE thus a possible duplicate join later on. I would do distinct. This same result could be done with the following query which you could sample times with...
SELECT distinct ip2.userid
from
interested_people ip
join interested_people ip2
ON ip.interested_in = ip2.interested_in
where
userid = [parmUserID]
Now, you need to exclude from this list all your contacts already accepted. You could then left-join TWO TIMES for the from/to contact and ensure NULL indicating not one of the contacts... Then join again to user table to get the user details.
SELECT
u.*
from
users u
JOIN
( SELECT distinct
ip2.userid
from
interested_people ip
join interested_people ip2
ON ip.interested_in = ip2.interested_in
left join contactList cl1
ON ip2.userid = cl1.user1
AND cl1.accepted = 1
left join contactList cl2
ON ip2.userid = cl2.user2
AND cl2.accepted = 1
where
ip.userid = [parmUserID]
AND NOT ip2.userID = [parmUserID] ) PreQuery
ON u.id = PreQuery.userID
order by
RAND()
limit
0, 10
I would have two indexes on your contactList table to optimize both left-joins... with user1 and user2 in primary position... Similarly for the interested_people table.
table index
contactList ( user1, accepted )
contactList ( user2, accepted )
interested_people ( userid, interested_in )
interested_people ( interested_in, userid )
I would expect your user table is already indexed on the ID as primary key.
Upvotes: 1
Reputation: 15951
Subqueries in WHERE clauses are often slow in MySQL; at least slower than comparable JOINs.
SELECT others.*
FROM interested_people AS userI
INNER JOIN interested_people AS othersI
ON userI.interestid = othersI.interestid
AND userI.userid <> othersI.userid
INNER JOIN users AS others ON othersI.user_id = others.userid
LEFT JOIN contactlist AS cl
ON userI.userid = cl.user1
AND others.userid = cl.user2
AND cl.accepted = 1
WHERE userI.userid = [userid]
AND cl.accepted IS NULL
ORDER BY RAND()
LIMIT 0, 10;
Note: intuition makes me wonder if contactlist might be better as a where subquery.
The AND cl.accepted IS NULL
ends up processed after the JOINs, resulting in allowing only results that did NOT have a match in contactlist.
If you want to enhance things a bit further:
SELECT others.*, COUNT(1) AS interestsCount
...
GROUP BY others.userid
ORDER BY interestsCount DESC, RAND()
LIMIT 0,10;
This would give you a random selection of the people that share the most interests in common.
Upvotes: 1