Reputation: 153
I have the following table:
+------------+-------+
| SchoolName | Marks |
+------------+-------+
| A | 71 |
| A | 71 |
| A | 71 |
| B | 254 |
| B | 135 |
| B | 453 |
| B | 153 |
| C | 453 |
| C | 344 |
| C | 223 |
| B | 453 |
| D | 300 |
| D | 167 |
+------------+-------+
And here is the average of marks grouped by school names:
+------------+------------+
| SchoolName | avg(Marks) |
+------------+------------+
| A | 71.0000 |
| B | 289.6000 |
| C | 340.0000 |
| D | 233.5000 |
+------------+------------+
https://www.db-fiddle.com/f/5t7N3Vx8FSQmwUJgKLqjfK/9
However rather than average, I want to calculate median of the marks grouped by school names.
I am using,
SELECT AVG(dd.Marks) as median_val
FROM (
SELECT d.Marks, @rownum:=@rownum+1 as `row_number`, @total_rows:=@rownum
FROM tablename d, (SELECT @rownum:=0) r
WHERE d.Marks is NOT NULL
ORDER BY d.Marks
) as dd
WHERE dd.row_number IN ( FLOOR((@total_rows+1)/2), FLOOR((@total_rows+2)/2) );
to calculate the average of entire Marks column, but I don't know how to do it for each school separately.
Upvotes: 7
Views: 4581
Reputation: 222402
Your query computes row numbers using user variables, which makes it more complicated to handle partitions. Since you are using MySQL 8.0, I would suggest using window functions instead.
This should get you close to what you expect:
select
SchoolName,
avg(Marks) as median_val
from (
select
SchoolName,
Marks,
row_number() over(partition by SchoolName order by Marks) rn,
count(*) over(partition by SchoolName) cnt
from tablename
) as dd
where rn in ( FLOOR((cnt + 1) / 2), FLOOR( (cnt + 2) / 2) )
group by SchoolName
The arithmetic stays the same, but we are using window functions in groups of records having the same SchoolName
(instead of a global partition in your initial query). Then, the outer query filters and aggregates by SchoolName
.
In your DB Fiddlde, this returns:
| SchoolName | median_val |
| ---------- | ---------- |
| A | 71 |
| B | 254 |
| C | 344 |
| D | 233.5 |
Upvotes: 13