Reputation: 73
Could someone explain me why the AVG() function gives me a weighted average in my code?
SELECT s.stud_id, s.country, SUM(e.paid) AS totalpaid
INTO #totalpaid
FROM oc.students AS s
JOIN oc.enrollment AS e ON s.stud_id = e.stud_id
GROUP BY s.country ,s.stud_id;
SELECT DISTINCT s.country, ROUND(AVG(t.totalpaid) OVER (PARTITION BY s.country),0) AS avg_country
FROM #totalpaid t
JOIN oc.students s ON t.stud_id = s.stud_id
JOIN oc.enrollment e ON e.stud_id = s.stud_id;
For example in the case of Malta Student12 attended 1 course and paid EUR45,Student837 attended 7 courses and paid EUR294 in total. I would like to have a simple (45+294)/2 calculation for the average however the system calculates like (1*45+7*294)/8. What am I doing wrong?
Upvotes: 1
Views: 310
Reputation: 73
In the meantime I found the solution:
SELECT
country,
ROUND(AVG(totalpaid) OVER (PARTITION BY country),0) AS avg_country
FROM #totalpaid;
Super easy :)
Upvotes: 1
Reputation: 2976
SELECT s.country, sum(e.paid) / count(DISTINCT s.stud_id) as average
FROM oc.students s
JOIN oc.enrollment e ON e.stud_id = s.stud_id
GROUP BY s.country;
Upvotes: 1
Reputation: 8101
In your second query, when you join your temp table back to enrollment
, it's generating one row per class; which is where the multiple values in the totalpaid
column are coming from.
The second query isn't using any columns that aren't already in the temp table, so you don't need those joins at all. This should produce what you're looking for.
SELECT
t.country,
ROUND(AVG(t.totalpaid) OVER (PARTITION BY t.country),0) AS avg_country
FROM #totalpaid t
GROUP BY
t.country;
Upvotes: 2
Reputation: 48770
Because you are joining your tables twice.
By putting together the INSERT
and SELECT
statements your query is equivalent to:
SELECT
DISTINCT s.country,
ROUND(AVG(t.totalpaid) OVER (PARTITION BY s.country),0) AS avg_country
FROM (
SELECT s.stud_id, s.country, SUM(e.paid) AS totalpaid
FROM oc.students AS s
JOIN oc.enrollment AS e ON s.stud_id = e.stud_id
GROUP BY s.country ,s.stud_id
) t
JOIN oc.students s ON t.stud_id = s.stud_id
JOIN oc.enrollment e ON e.stud_id = s.stud_id
There you can clearly see the table students
and enrollment
are joined twice. That would produce a skewed average function.
Upvotes: 2