Reputation: 1655
This is a question from Stanford online database course exercise. Find the movie(s) with the highest average rating. Return these movie title(s) and their average rating. Use SQLite.
I've seen solutions kindly suggested by others, e.g,
But what I hope to understand here is where and why my current solution present here went wrong.
The movie rating table:
rID mID stars ratingDate
201 101 2 2011-01-22
201 101 4 2011-01-27
203 103 2 2011-01-20
203 108 4 2011-01-12
203 108 2 2011-01-30
204 101 3 2011-01-09
205 103 3 2011-01-27
205 104 2 2011-01-22
...
Note: mID represents movie ID, rID represents reviewer ID, stars represent the scores ranked by reviewers.
My first thought is to get the average scores for each movie, using code as following:
Select mID, avg(stars) AS avg_stars
From Rating
Group by mID
The resulting summary table is
mID avg_stars
101 3.0
103 2.5
104 2.5
106 4.5
107 4.0
108 3.3
Then I want to select the max values of the scores column and the assciated mIDs
Select mID, max(avg_stars) AS Max_score
From (
Select mID, avg(stars) AS avg_stars
From Rating
Group by mID) T
I expected to get:
mID Max_score
106 4.5
But instead, I get:
mID Max_score
108 4.5
Upvotes: 2
Views: 12624
Reputation: 60462
You seem to use MySQL as a DBMS, which allows non-Standard syntax:
When you return mID
without adding it to GROUP BY
MySQL returns a single row with the maximum(average) and a random value for mID.
This is a generic Standard SQL solution:
Select mID, avg(stars) AS avg_stars
From Rating
Group by mID
having avg(stars) =
( select max(avg_stars) -- find the max of the averages
from
( Select mID, avg(stars) AS avg_stars
From Rating
Group by mID
) T
)
This might be very inefficient, that's why there are several proprietary syntax extensions. Most DBMSes (but not MySQL) support Standard SQL's Windowed Aggregate Functions:
select *
from
(
Select mID, avg(stars) AS avg_stars,
max(avg(stars)) as max_avg
From Rating
Group by mID
) T
where avg_stars = max_avg
Edit:
As you added SQLite as DBMS, my 2nd query will not work (SQLite doesn't support Analytical Functions, too).
But as WITH
is supported you can simplify #1 to a query similar to @user3580870's:
with cte as
( Select mID, avg(stars) AS avg_stars
From Rating
Group by mID
)
select * from cte
where avg_stars =
( select max(avg_stars) -- find the max of the averages
from cte
);
And this is still Standard SQL compliant...
Upvotes: 4
Reputation: 18217
Perhaps a WITH clause can do the trick. Cannot currently access a live DB to test, but the query should look like:
WITH sq AS
(SELECT mID, avg(stars) AS avg_stars FROM rating GROUP BY mID)
SELECT mId,avg_stars FROM sq t1 JOIN
(SELECT max(avg_stars) AS max_avg FROM sq) t2
ON t1.avg_stars = t2.max_avg;
SQLite supports WITH clauses from version 3.8.3. This code was actually tested on the data in the question. It also makes the calculation explicitly more efficient than other solutions (though they may be optimized by some smart query analysis). Furthermore, it is clearest and the WITH
clause can offer the path for further twists easily.
Upvotes: 2
Reputation: 903
This is just an extension to @SMA's answer.
If you want to consider duplicates records as single entity you can use the following query
SELECT mID,AVG(stars)
FROM Rating
WHERE AVG(stars) IN (
(SELECT AVG(stars)
FROM Rating
GROUP BY mID
ORDER BY avg_stars DESC LIMIT 1)
)
GROUP BY mID
The above query might not be an optimized one.
Upvotes: 1
Reputation: 37023
Instead of subquery try using order by and limit to first result:
SELECT mID, AVG(stars) AS avg_stars
FROM Rating
GROUP BY mID
ORDER BY avg_stars DESC LIMIT 1;
Upvotes: 2