Reputation: 68658
I'm pulling a list of items from one table, on the basis of their being included in another table, like this:
select fruit.id, fruit.name from fruit, fruit_rating where fruit_rating.fruit_id=fruit.id group by fruit.name;
That works fine -- it basically produces a list of all the fruit that has been rated by someone. But now, I want to exclude all fruit that has been rated by one specific user, so I tried this:
select fruit.id, fruit.name from fruit, fruit_rating where fruit_rating.fruit_id=fruit.id and fruit_rating.user_id != 10 group by fruit.name;
That's ok, but not quite right. It shows all the fruit that have been rated by people other than 10, but if users 1 and 10 have both rated the same fruit, it still shows that one. Can anyone tell me how to construct a query that shows only the fruit that have NOT Been rated by user 10, regardless of who else has rated them?
Upvotes: 3
Views: 18687
Reputation: 86798
I read this differently from Cowan, and agree with Noah...
Find all fruit where: - User 10 did not rate it - Atleast one other user did rate it
However, in my experience using NOT IN can be quite slow. So, I generally prefer to filter using LEFT JOIN in the same way as Cowan. Here are a few different options, though I have not had time to test performance on large data sets...
SELECT
[f].id,
[f].name
FROM
fruit AS [f]
INNER JOIN
fruit_rating AS [fr]
ON [fr].fruit_id = [f].id
GROUP BY
[f].id,
[f].name
HAVING
SUM(CASE WHEN [fr_exclude].user_id = 10 THEN 1 ELSE 0 END) = 0
SELECT
[f].id,
[f].name
FROM
fruit AS [f]
INNER JOIN
fruit_rating AS [fr]
ON [fr].fruit_id = [f].id
LEFT JOIN
fruit_rating AS [fr_exclude]
ON [fr_exclude].fruit_id = [fr].fruit_id
AND [fr_exclude].user_id = 10
GROUP BY
[f].id,
[f].name
HAVING
MAX([fr_exclude].user_id) IS NULL
As this only works for one user, I would also consider making a table of "users to exclude" and LEFT JOIN on that instead...
SELECT
[f].id,
[f].name
FROM
fruit AS [f]
INNER JOIN
fruit_rating AS [fr]
ON [fr].fruit_id = [f].id
LEFT JOIN
excluded_users AS [ex]
AND [ex].user_id = [fr].user_id
GROUP BY
[f].id,
[f].name
HAVING
MAX([ex].user_id) IS NULL
Or something much more long winded, but I suspect is the fastest on larger data sets with appropriate indexes...
SELECT
[f].id,
[f].name
FROM
fruit [f]
INNER JOIN
(
SELECT
fruit_id
FROM
fruit_rating
GROUP BY
fruit_id
)
AS [rated]
ON [rated].fruit_id = [f].id
LEFT JOIN
(
SELECT
[fr].fruit_id
FROM
fruit_rating AS [fr]
INNER JOIN
excluded_users AS [ex]
ON [ex].user_id = [fr].user_id
GROUP BY
[fr].fruit_id
)
AS [excluded]
ON [rated].fruit_id = [excluded].fruit_id
WHERE
[excluded].fruit_id IS NULL
GROUP BY
[f].id,
[f].name
Upvotes: 4
Reputation: 37553
One thing that isn't quite 100% clear to me: do you want all fruit that hasn't been rated by user 10, or just fruit that has been rated by other people but not by user 10? e.g. should fruit that has no ratings be included?
I think you want all fruit (including unrated), in which case Noah and Mr. Brownstone's answers aren't quite what you're after. If you remove the inner join to fruit_rating, and the now-needless group by, theirs will include unrated fruit. An alternative approach, which avoids the subselect, is
select f.id, f.name
from fruit f
left join fruit_rating fr on
(f.id = fr.fruit_id)
and (fr.user_id = 10)
where
(fr.user_id is null)
That is, do a left join (an optional join, if you like) onto fruit rating ONLY for user 10, and then only return rows where a match WASN'T found.
Upvotes: 1
Reputation: 13959
I refined your query a bit to make it a little easier to read, and added a subquery to filter out all of the fruit that's been rated by user 10
select f.id, f.name
from fruit f
inner join fruit_rating fr on
fr.fruit_id = f.id
where f.id not in (
select id
from fruit_rating
where [user_id] = 10)
group by fruit.name;
Upvotes: 3
Reputation: 8479
... WHERE fruit_rating.fruit_id=fruit.id
and fruit.id not in
(select fruit_rating.fruit_id
from fruit_rating
where fruit_rating.user_id = 10)
Upvotes: 6