Reputation: 366
Question:
Find the difference between the average rating of movies released before 1980 and the average rating of movies released after 1980. (Make sure to calculate the average rating for each movie, then the average of those averages for movies before 1980 and movies after. Don't just calculate the overall average rating before and after 1980.)
Note: This is in SQL Lite.
Schema:
Movie ( mID, title, year, director )
English: There is a movie with ID number mID, a title, a release year, and a director.
Reviewer ( rID, name )
English: The reviewer with ID number rID has a certain name.
Rating ( rID, mID, stars, ratingDate )
English: The reviewer rID gave the movie mID a number of stars rating (1-5) on a certain ratingDate.
Below is my attempt at solving this. Here is my logic and breakdown of my mess of a query.
The question above says (Make sure to calculate the average rating for each movie and THEN the average of those averages for movies before 1980 and movies after). For this part of the query below, I sought to create a table of all the average ratings of every movie. I know I want to reference those Averages later as I need to average them AGAIN so I want to create a Table I can use later which is why I chose to put a subquery in the From clause.
(SELECT avg(Stars) as avgStars from Rating) as Avg1,
I then reference the table I tried to create above as I want to average those original averages and then separate the movies by year. This query below is Movies released before 1980. I know I want to reference this later as well in the select clause so I rename it Pre1980.
(SELECT avg(avgStars) from Avg1 where mID in
(SELECT mID from Movie where year < 1980)) as Pre1980,
(SELECT avg(avgStars) from Avg1 where mID in
(SELECT mID from Movie where year > 1980)) as Post1980;
SELECT Post1980.AvgStars - Pre1980.AvgStars
Attempted Solution
SELECT Post1980.AvgStars - Pre1980.AvgStars
FROM
(SELECT avg(Stars) as avgStars from Rating) as Avg1,
(SELECT avg(avgStars) from Avg1 where mID in
(SELECT mID from Movie where year < 1980)) as Pre1980,
(SELECT avg(avgStars) from Avg1 where mID in
(SELECT mID from Movie where year > 1980)) as Post1980;
My Problem
I get an error says Query Failed to execute: no such table: Avg 1
. Is there something wrong that I'm doing in my steps? In addition, I feel like I'm making more mistakes than just the renaming so if there are any mistakes in my logic please help! I would love to learn the logic behind this query.
Upvotes: 3
Views: 1739
Reputation: 1
--The correct result is 0.0555555555555558
SELECT avg(avgBefore) - avg(avgAfter)
FROM
(SELECT avg(r1.stars) AS avgBefore
FROM Rating r1
JOIN Movie m ON r1.mID = m.mID
WHERE m.year < 1980
GROUP BY r1.mID) TabBefore ,
(SELECT avg(r1.stars) AS avgAfter
FROM Rating r1
JOIN Movie m ON r1.mID = m.mID
WHERE m.year > 1980
GROUP BY r1.mID) TabAfter
Upvotes: 0
Reputation: 19
Use the following code:
SELECT AVG(S1)-AVG(S2)
FROM(
SELECT AVG(STARS) S1
FROM MOVIE M,RATING R
WHERE M.MID=R.MID and year<1980
GROUP BY M.MID
),
(SELECT AVG(STARS) S2
FROM MOVIE M,RATING R
WHERE M.MID=R.MID and year>1980
GROUP BY M.MID);
Upvotes: 0
Reputation: 222412
As explictly stated in the question, you would need two levels of aggregation to get the job done. I would recommend using conditional aggregation to compute the averages per period.
This first query gives you the average ratings for each movie:
SELECT m.mID, m.year, AVG(r.stars * 1.0) avg1
FROM Movie m
INNER JOIN Rating r ON m.mID = r.mID
GROUP BY m.mID, m.year
Now you can easily turn this to a subquery to aggregate again:
SELECT
COALESCE(AVG(CASE WHEN year < 1980 THEN avg1 END), 0)
- COALESCE(AVG(CASE WHEN year >= 1980 THEN avg1 END), 0) rating_diff
FROM (
SELECT m.mID, m.year, AVG(stars * 1.0) avg1
FROM Movie m
INNER JOIN Rating r ON m.mID = r.mID
GROUP BY m.mID, m.year
) x
Upvotes: 1