Reputation: 480
I need to compute the difference between two different rating averages, one for rating of movies before a specific year_cutoff (1980) and the second one after year_cutoff, querying from two different databases, Rating and Movies.
What I did is:
I need to compute the difference of the average stars between the movies pre-1980 and post-1980.
At first I GROUPED BY movie titles and compute the rating average for each group.
Secondly, I divided these groups into two categories, pre-1980 and post-1980
Finally, I am trying to recompute the average (of the average) for each of these two groups, namely avgBefore and avgAfter and compute the difference of this two new averages
I expect one number that is the avgBefore - avgAfter (avgB - avgA)
Below is my attempt for the code. My main problem is to insert correctly the two conditions 'before' and 'after' 1980. I am trying to define aliases like avgB and avgA, but apparently the UNION clause is not properly called.
SELECT AVG(avgB) - AVG(avgA)
FROM(
SELECT AVG(stars) as avgB
FROM Rating
JOIN Movie
ON Rating.mID = Movie.mID
GROUP BY title
HAVING year < 1980
UNION
SELECT AVG(stars) as avgA
FROM Rating
JOIN Movie
ON Rating.mID = Movie.mID
GROUP BY title
HAVING year > 1980
);
Upvotes: 0
Views: 116
Reputation: 3675
Try this:
SELECT AVG(avgB) - AVG(avgA)
FROM(
SELECT AVG(stars) as avgB
FROM Rating
JOIN Movie
ON Rating.mID = Movie.mID
WHERE Movie.year < 1980
GROUP BY title
UNION
SELECT AVG(stars) as avgA
FROM Rating
JOIN Movie
ON Rating.mID = Movie.mID
WHERE Movie.year >= 1980
GROUP BY title
);
NOTE: I don't know where you get the year
value from, so you will need to specify it.
UPDATE: fixed reference to year
.
UPDATE 2: Corrected query.
Here is what I came up. Please note that I don't have access right now to a DB to verify syntax, but I guess it should be mostly OK:
SELECT (AVG(B.Rating_OLD) - AVG(B.Rating_NEW)) AS Rating_Diff
FROM (
SELECT A.title , AVG(A.stars_OLD) AS Rating_OLD ,AVG(A.stars_NEW) AS Rating_NEW
FROM (
SELECT title ,
CASE
WHEN Movie.year < 1980 THEN Rating.stars
ELSE 0
END AS stars_OLD ,
CASE
WHEN Movie.year >= 1980 THEN Rating.stars
ELSE 0
END AS stars_NEW
FROM Rating
JOIN Movie
ON Rating.mID = Movie.mID
) A
GROUP BY A.title
) B
Upvotes: 0
Reputation: 95053
You want to get the average rating per movie first. From this result you want to get the average movie rating both pre-1980 and post-1980. Two steps hence: an aggregation per movie, then a total aggregation.
select
avg(r.stars) as total,
avg(case when m.year < 1980 then r.stars end) as pre1980,
avg(case when m.year > 1980 then r.stars end) as post1980,
avg(case when m.year < 1980 then r.stars end) -
avg(case when m.year > 1980 then r.stars end) as diff
from
(
select mid, avg(stars) as stars
from rating
group by mid
) r
join movie m on m.mid = r.mid;
(As already mentioned, you may want to include 1980 movies in either the pre or the post range and not omit them completly.)
Upvotes: 1
Reputation: 1270371
The most efficient way to do this uses conditional aggregation:
SELECT (AVG(CASE WHEN m.year < 1980 THEN r.stars END) -
AVG(CASE WHEN m.year >= 1980 THEN r.stars END)
) AS averag
FROM Rating r JOIN
Movie m
ON r.mID = m.mID;
Trying to use two separate queries just makes the query more complicated than it needs to be.
Upvotes: 0
Reputation: 1399
Use join instead on UNION:
SELECT
AVG(avgB) - AVG(avgA)
FROM
(
(
SELECT
AVG(stars) as avgB
FROM
Rating
JOIN
Movie
ON
Rating.mID = Movie.mID
WHERE
year < 1980
) t1
CROSS JOIN
(
SELECT
AVG(stars) as avgA
FROM
Rating
JOIN
Movie
ON
Rating.mID = Movie.mID
WHERE
year < 1980
) t2
);
Upvotes: 0
Reputation: 5656
What if we follow this approach:
SELECT
(AVG(CASE WHEN year < 1980 THEN stars ELSE 0 END)-
AVG(CASE WHEN year >= 1980 THEN stars ELSE 0 END)) AS averag
FROM Rating
JOIN Movie ON Rating.mID = Movie.mID
Upvotes: 0
Reputation: 84
You are doing it wrong way. You can't have two different aliases in UNION. Your script knows only about alias avgB. Instead of UNION i suggest that you use JOIN.
Upvotes: 0