Reputation: 1241
I was currently solving the Databases exercises on Coursera and encountered a problem.
You've started a new movie-rating website, and you've been collecting data on reviewers' ratings of various movies. There's not much data yet, but you can still try out some interesting queries. Here's the schema:
Movie ( mID, title, year, director )
Reviewer ( rID, name )
Rating ( rID, mID, stars, ratingDate )
Find the titles of all movies that have no ratings.
SELECT DISTINCT title
FROM Movie, Rating
WHERE Movie.mID NOT IN
(SELECT mID
FROM Rating)
SELECT DISTINCT title
FROM Movie, Rating
WHERE NOT EXISTS(SELECT * FROM Rating
WHERE Rating.mID = Movie.mID)
SELECT title
FROM Movie
WHERE title NOT IN(SELECT title
FROM Movie NATURAL JOIN Rating)
I wanted to know whether this can be optimised because I applied DISTINCT
to make them run.
Upvotes: 0
Views: 119
Reputation: 60493
well, first, you don't need to select from Movie,
Rating, as you don't do anything with Rating table.
And this is doing an horrible (and useless) cross join => cartesian product, which force you do make a Distinct.
so
select m.title
from Movie m
where not exists (select null -- or select *, doesn't change anything
from Rating r
where r.mId = m.mID)
you could also do something like that, but I think the previous one is clearer.
select m.title
from Movie m
left join Rating r on r.mId = m.mID
where r.rID is null
A NOT IN
query (without join on Rating) is also perfectly valid, but probably the one with poorest performance, and would be written this way.
select m.title
from Movie m
where m.mId not in (select distinct mID from Rating)
Upvotes: 3
Reputation: 1035
Including Rating in the outer FROM is not needed, without it you can avoid DISTINCT.
Upvotes: 0