Reputation: 12717
I have a database of Facebook Likes from several people. There are duplicate "like_id" fields across many "user_id"s. I want a query that will find the amount of "like_id"s person A has in common with person B.
This query is fantastic for comparing likes when only 2 "user_id"s are in the database, but as soon as I add a 3rd, it messes it up. Basically, I want to see who has the most "likes" in common with with person A.
SELECT *,
COUNT(*)
FROM likes
GROUP BY like_id
HAVING COUNT(*) > 1
Anyone have a query that might work?
Upvotes: 0
Views: 170
Reputation: 7336
I think this will do it:
SELECT
likes_a.user_id,
likes_b.user_id
FROM
likes as likes_a JOIN likes as likes_b
ON
likes_a.like_id = likes_b.like_id
WHERE
likes_a.user_id <> likes_b.user_id
And then post-process the results to count up who has the most in common.
Upvotes: 0
Reputation: 198
This SQL should work. You just need to put in the User A's user_id and it should compare with all other users and show the top matching one. You can change it to show the top 5 or do whatever else you need to do.
Basically what it is doing is that it is doing a self join on the table, but making sure that when it does a join, it is a different user_id but the "like" is the same. Then it does a group by each of the other user_id's and sums the same amount of likes for that user_id.
SELECT all_other_likes.user_id, count(all_other_likes.like_id) AS num_similar_likes
FROM likes original_user_likes
JOIN likes all_other_likes
ON all_other_likes.user_id != original_user_likes.user_id
AND original_user_likes.like_id = all_other_likes.like_id
WHERE original_user_likes = USER_ID_YOU_WANT_TO_COMPARE
GROUP BY all_other_likes.user_id
ORDER BY count(all_other_likes.like_id) DESC
LIMIT 1;
Not sure what database you are using. You might need to do a SELECT TOP 1 if it is MS-SQL, but this is valid PostgreSQL and MySQL syntax.
Upvotes: 1