Reputation: 325
Let's say I have 1000 users for my app. I ask them 100 questions with answers just yes/no and I record those answers in a seperate table.
Now, I want to see people who has given the same answers to at least 20 questions.
What kind of algorithm should I follow in order to do this? What are the relevant keywords for googling?
P.S. I work in a WAMP environment.
Upvotes: 0
Views: 57
Reputation: 4218
Join your answers table to itself, selecting answers which share the same question_id
and answer
but have a different user_id
. Group the rows by both user_id
s and use a HAVING
clause to exclude those with less than 20 matching answers.
Example where you are looking for users similar to your user with user_id
"1":
SELECT DISTINCT a2.user_id FROM answers a
INNER JOIN answers a2
ON a.question_id = a2.question_id
AND a.answer = a2.answer
AND a.user_id != a2.user_id
WHERE a.user_id = 1
GROUP BY a.user_id, a2.user_id
HAVING COUNT(*) >= 20;
Technically you don't need to group by a.user_id
in this case but I've left it there in case you want to modify the WHERE
clause to return results for more than one a.user_id
.
Upvotes: 4