mike
mike

Reputation: 2780

SQL query not returning unique results. Which type of join do I need to use?

I'm trying to run the following MySQL query:

SELECT * 
FROM user u 
JOIN user_categories uc ON u.user_id = uc.user_id 
WHERE (uc.category_id = 3 OR uc.category_id = 1)

It currently returns:

Joe,Smith,60657,male
Joe,Smith,60657,male
Mickey,Mouse,60613,female
Petter,Pan,60625,male
Petter,Pan,60625,male
Donald,Duck,60615,male

If the user belongs to both categories it currently returns them twice. How can I return the user only once without using SELECT DISTINCT, regardless of how many categories they belong to?

Upvotes: 1

Views: 267

Answers (3)

Andrew Lazarus
Andrew Lazarus

Reputation: 19340

I don't know about MySQL, but in Postgres you may get better performance in the semi-join version from

SELECT * FROM user u 
WHERE u.user_id 
IN (SELECT user_id FROM user_categories uc WHERE uc.category_id IN (1,3));

I would expect SELECT DISTINCT to run fastest but I have learned my expectations and DB performance are often much different!

Upvotes: 3

armonge
armonge

Reputation: 3138

Try using a GROUP BY

SELECT * FROM user u
JOIN user_categories uc ON u.user_id = uc.user_id
WHERE uc.category_id = 3 OR uc.category_id = 1
GROUP BY u.user_id

Upvotes: 0

Martin Smith
Martin Smith

Reputation: 453707

You need a semi join. This can be achieved with a sub query.

SELECT * 
FROM user u 
WHERE EXISTS(SELECT * 
       FROM user_categories uc 
       WHERE u.user_id = uc.user_id AND  
       uc.category_id IN(1,3))

In MySQL the performance of sub queries is quite problematic however so a JOIN and duplicate elimination via DISTINCT or GROUP BY may perform better.

Upvotes: 6

Related Questions