Maggie Liu
Maggie Liu

Reputation: 366

SQL Stanford Mini Course - Difference between average Rating of Movies: logic behind problem + no such table error

Question:

Find the difference between the average rating of movies released before 1980 and the average rating of movies released after 1980. (Make sure to calculate the average rating for each movie, then the average of those averages for movies before 1980 and movies after. Don't just calculate the overall average rating before and after 1980.)

Note: This is in SQL Lite.

Schema:

Movie ( mID, title, year, director )

English: There is a movie with ID number mID, a title, a release year, and a director.

Reviewer ( rID, name ) 

English: The reviewer with ID number rID has a certain name.

Rating ( rID, mID, stars, ratingDate ) 

English: The reviewer rID gave the movie mID a number of stars rating (1-5) on a certain ratingDate.

Below is my attempt at solving this. Here is my logic and breakdown of my mess of a query.

  1. The question above says (Make sure to calculate the average rating for each movie and THEN the average of those averages for movies before 1980 and movies after). For this part of the query below, I sought to create a table of all the average ratings of every movie. I know I want to reference those Averages later as I need to average them AGAIN so I want to create a Table I can use later which is why I chose to put a subquery in the From clause.

    (SELECT avg(Stars) as avgStars from Rating) as Avg1,

  2. I then reference the table I tried to create above as I want to average those original averages and then separate the movies by year. This query below is Movies released before 1980. I know I want to reference this later as well in the select clause so I rename it Pre1980.

    (SELECT avg(avgStars) from Avg1 where mID in (SELECT mID from Movie where year < 1980)) as Pre1980,

    1. Finally, I create an average of the original averages in Step 1 for the movies released after 1980 and rename it so I can reference it again in the Select Clause.

    (SELECT avg(avgStars) from Avg1 where mID in (SELECT mID from Movie where year > 1980)) as Post1980;

    1. My last step is the difference between the average ratings of movies released before 1980 and the average of ratings of movies released after 1980.

    SELECT Post1980.AvgStars - Pre1980.AvgStars

Attempted Solution

SELECT Post1980.AvgStars - Pre1980.AvgStars 
        FROM 
           (SELECT avg(Stars) as avgStars from Rating) as Avg1, 
           (SELECT avg(avgStars) from Avg1 where mID in 
           (SELECT mID from Movie where year < 1980)) as Pre1980, 
           (SELECT avg(avgStars) from Avg1 where mID in 
           (SELECT mID from Movie where year > 1980)) as Post1980;

My Problem

I get an error says Query Failed to execute: no such table: Avg 1. Is there something wrong that I'm doing in my steps? In addition, I feel like I'm making more mistakes than just the renaming so if there are any mistakes in my logic please help! I would love to learn the logic behind this query.

Upvotes: 3

Views: 1739

Answers (3)

Filipe Elias
Filipe Elias

Reputation: 1

--The correct result is 0.0555555555555558

SELECT avg(avgBefore) - avg(avgAfter)
FROM

  (SELECT avg(r1.stars) AS avgBefore
   FROM Rating r1
   JOIN Movie m ON r1.mID = m.mID
   WHERE m.year < 1980
   GROUP BY r1.mID) TabBefore ,

  (SELECT avg(r1.stars) AS avgAfter
   FROM Rating r1
   JOIN Movie m ON r1.mID = m.mID
   WHERE m.year > 1980
   GROUP BY r1.mID) TabAfter

Upvotes: 0

Amuthini
Amuthini

Reputation: 19

Use the following code:

SELECT AVG(S1)-AVG(S2) 
FROM(
SELECT AVG(STARS) S1 
FROM MOVIE M,RATING R 
WHERE M.MID=R.MID and year<1980
GROUP BY M.MID
),
(SELECT AVG(STARS) S2 
FROM MOVIE M,RATING R 
WHERE M.MID=R.MID and year>1980
GROUP BY M.MID);

Upvotes: 0

GMB
GMB

Reputation: 222412

As explictly stated in the question, you would need two levels of aggregation to get the job done. I would recommend using conditional aggregation to compute the averages per period.

This first query gives you the average ratings for each movie:

SELECT m.mID, m.year, AVG(r.stars * 1.0) avg1
FROM Movie m
    INNER JOIN Rating r ON m.mID = r.mID
GROUP BY m.mID, m.year

Now you can easily turn this to a subquery to aggregate again:

SELECT
    COALESCE(AVG(CASE WHEN year < 1980 THEN avg1 END), 0)
       - COALESCE(AVG(CASE WHEN year >= 1980 THEN avg1 END), 0) rating_diff
FROM (
    SELECT m.mID, m.year, AVG(stars * 1.0) avg1
    FROM Movie m
    INNER JOIN Rating r ON m.mID = r.mID
    GROUP BY m.mID, m.year
) x

Upvotes: 1

Related Questions