Nick Woodhams
Nick Woodhams

Reputation: 12717

Complex MySQL Query - Find duplicates PER user_id?

I have a database of Facebook Likes from several people. There are duplicate "like_id" fields across many "user_id"s. I want a query that will find the amount of "like_id"s person A has in common with person B.

This query is fantastic for comparing likes when only 2 "user_id"s are in the database, but as soon as I add a 3rd, it messes it up. Basically, I want to see who has the most "likes" in common with with person A.

  SELECT *, 
         COUNT(*)
    FROM likes
GROUP BY like_id
  HAVING COUNT(*) > 1

Anyone have a query that might work?

Upvotes: 0

Views: 170

Answers (2)

ObscureRobot
ObscureRobot

Reputation: 7336

I think this will do it:

SELECT
  likes_a.user_id,
  likes_b.user_id 
FROM
  likes as likes_a JOIN likes as likes_b
ON
  likes_a.like_id = likes_b.like_id 
WHERE
  likes_a.user_id <> likes_b.user_id

And then post-process the results to count up who has the most in common.

Upvotes: 0

thait84
thait84

Reputation: 198

This SQL should work. You just need to put in the User A's user_id and it should compare with all other users and show the top matching one. You can change it to show the top 5 or do whatever else you need to do.

Basically what it is doing is that it is doing a self join on the table, but making sure that when it does a join, it is a different user_id but the "like" is the same. Then it does a group by each of the other user_id's and sums the same amount of likes for that user_id.

SELECT  all_other_likes.user_id, count(all_other_likes.like_id) AS num_similar_likes
FROM    likes original_user_likes
JOIN    likes all_other_likes
ON      all_other_likes.user_id != original_user_likes.user_id
AND     original_user_likes.like_id = all_other_likes.like_id
WHERE   original_user_likes = USER_ID_YOU_WANT_TO_COMPARE
GROUP BY all_other_likes.user_id
ORDER BY count(all_other_likes.like_id) DESC
LIMIT 1;

Not sure what database you are using. You might need to do a SELECT TOP 1 if it is MS-SQL, but this is valid PostgreSQL and MySQL syntax.

Upvotes: 1

Related Questions