Reputation: 3437
I'm running MariaDB 10.3 which supports the MEDIAN
function, and in simple cases this function seems to work well. However I'm having trouble making it work in complex situations. Currently I'm trying to calculate the number, mean and median value for each of several types of item. Here's a MWE:
CREATE TABLE items (type INTEGER, value INTEGER);
INSERT INTO items (type, value) VALUES (1,1), (1,3), (1,4), (2,10), (2,32), (2,42);
SELECT type, COUNT(*) AS count, AVG(value) AS mean,
MEDIAN(value) OVER(PARTITION BY type) AS median FROM items GROUP BY type;
DROP TABLE items;
This calculates the count and mean correctly, but not the median:
type | count | mean | median |
---|---|---|---|
1 | 3 | 2.6667 | 1.0000000000 |
2 | 3 | 28.0000 | 10.0000000000 |
I expect it's some silly error in my SQL, but I can't for the life of me see it, and no errors or warnings are reported. I've tried dropping the PARTITION BY type
which changes the medians returned but they're still not correct. Removing the OVER
clause results in a syntax error.
I don't understand why this isn't working. If I just do the following, the median is calculated correctly, though with lots of duplicate lines, but adding GROUP BY type
gives the erroneous values above.
SELECT type, MEDIAN(value) OVER(PARTITION BY type) AS median FROM items;
Upvotes: 0
Views: 534
Reputation: 164064
This query:
SELECT type,
COUNT(*) AS count,
AVG(value) AS mean
FROM items
GROUP BY type;
is an aggregation query and if you add to it the window function MEDIAN()
it will be calculated on the results that it returns which are:
type | count | mean |
---|---|---|
1 | 3 | 2.6667 |
2 | 3 | 28.0000 |
with an arbitrary value for value
(which seems to be the first of each partition).
What you should do is get the median in a subquery and then aggregate:
SELECT type,
COUNT(*) AS count,
AVG(value) AS mean,
median
FROM (
SELECT *, MEDIAN(value) OVER (PARTITION BY type) AS median
FROM items
) t
GROUP BY type, median;
Or, use only window functions:
SELECT DISTINCT type,
COUNT(*) OVER (PARTITION BY type) AS count,
AVG(value) OVER (PARTITION BY type) AS mean,
MEDIAN(value) OVER (PARTITION BY type) AS median
FROM items;
See the demo.
Upvotes: 1