Reputation: 606
I'm trying to select the average rating among all reviews, but also computing values such as total number of reviews, number of reviews grouped by rating, and the percentage of each rating group (e.g. "5": 10, "4": 7, "3": 5)
Problem is, the average rating is calculated incorrectly.
SQL
SELECT
rating,
count(*) as rating_count, /* number of ratings per group(e.g. 5: 10, 4: 7, 3: 5) */
avg(rating) over() as rating_avg, /* average rating among all reviews */
count(*) * 100 / sum(count(*)) over() as rating_share, /* percentage share of the rating group */
sum(count(*)) over() as total_reviews
FROM post_review
WHERE post_id = 1
GROUP BY rating;
Sample data
post_id, rating
1, 5,
1, 3,
1, 5,
Expected rating_avg
: 4.33
Resulted rating_avg
: 4
Question:
I know, that avg(rating) over() as rating_avg,
is not a correct statement for selecting the average rating in this case and I can't come up with a proper solution.
Is that possible to do all those computations in one SQL statement?
EDIT: Detailed Result
rating | rating_count | rating_avg | rating_share | total_reviews
__________________________________________________________________
3 |1 | 4 | 33.33 | 3
5 |2 | 4 | 66.67 | 3
Upvotes: 0
Views: 129
Reputation: 5245
You can achieve this with a cte, instead of window functions. Something like this:
with cte as
(SELECT avg(rating) rating_avg,
count(*) total_reviews, post_id
FROM post_review
WHERE post_id = 1
GROUP BY post_id
)
SELECT rating,
count(*) as rating_count,
cte.rating_avg,
(count(*) * 100)::numeric / cte.total_reviews::numeric as rating_share,
cte.total_reviews
FROM post_review
inner join cte ON post_review.post_id = cte.post_id
WHERE post_review.post_id = 1
GROUP BY rating, cte.rating_avg, cte.total_reviews;
The problem with your SQL is the GROUP BY. The average rating is 4 because the GROUP BY condenses the two ratings of 5 into 1, so that the average is the average of 5 and 3 (=4) rather than 5, 5, and 3.
As an alternative using sum over()
you can do:
SELECT
rating,
count(*) as rating_count,
sum(rating * count(*)) over() / sum(count(*)) over() as rating_avg,
count(*) * 100 / sum(count(*)) over() as rating_share,
sum(count(*)) over() as total_reviews
FROM post_review
WHERE post_id = 1
GROUP BY rating;
Upvotes: 1