Reputation: 318
I have an SQL table with one column of integer values and I would like to find the median of those values without any sorting. I first tried finding the numbers in the table that are greater than and less than each value:
SELECT DISTINCT d1.v,
(SELECT COUNT(d2.v)
FROM Values d2
WHERE d2.v > d1.v
) AS greater_than,
(SELECT COUNT(d2.v)
FROM Values d2
WHERE d2.v < d1.v
) AS less_than
FROM Values d1
I'm not sure how to proceed. I believe I want the values in the above table where greater_than
and less_than
are both equal to num_entries / 2
, but that would only work for a table with an even number of entries. What's a good way to get the median from what I have above?
Upvotes: 3
Views: 1049
Reputation: 144
Look for total numbers for greater than and less than self number that is equal to the total number of rows/2.
create table med(id integer);
insert into med(id) values(1);
insert into med(id) values(2);
insert into med(id) values(3);
insert into med(id) values(4);
insert into med(id) values(5);
insert into med(id) values(6);
select (MIN(count)+MAX(count))/2 from
(select case when (select count(*) from
med A where A.id<B.id)=(select count(*)/2 from med) OR
(select count(*) from med A where A.id>B.id)=(select count(*)/2
from med) then cast(B.id as float)end as count from med B) C;
?column?
----------
3.5
(1 row)
You need cast to convert to float, otherwise you would get 3 here.
Upvotes: 0
Reputation: 349989
You could do it like this:
SELECT (MIN(v)+MAX(v))/2
FROM (
SELECT CASE WHEN
LEAST((SELECT COUNT(*) FROM tbl WHERE v <= d1.v),
(SELECT COUNT(*) FROM tbl WHERE v >= d1.v))
>= (SELECT COUNT(*)/2 FROM tbl)
THEN v
END as v
FROM tbl d1
) as sub
The inner query is guaranteed to return 1 or 2 distinct non-null
values among potentially many null
values. The non-null
values may repeat, but by taking the minimum and maximum those two values can be used to calculate the median.
NB: Don't name your table Values
: it is a reserved word.
Upvotes: 2