Reputation: 335
I have a bin of distribution of 100 marks , each bin has different distributions based on grading schema
Now I have a table with subject ID and array column of number of students who scored in the respective bins like this:
<26 <41 <61 <71 <81 <91 <101
4 15 11 14 18 19 28
How to get the 25th, 50th, 75th and 95th percentile scores from this distribution in Postgres? The distribution may not be normal.
Is it possible to aggregate this in timescaledb (which is where I need to put this actually), since it doesn't support continuous aggregates as of yet?
Upvotes: 0
Views: 387
Reputation: 1270463
You can unpivot the data and extract information that way. Of course, you have insufficient information to really regenerate the original data. The following just uses one less than the bin max:
select generate_series(1, num, 1) as n, v.val
from t cross join lateral
(values ("<26", 25), ("<41", 40), . . . ) v(num, val);
Then you can run functions on this, such as:
select percentile_cont(0.5) within group (order by val)
from (select generate_series(1, num, 1) as n, v.val
from t cross join lateral
(values ("<26", 25), ("<41", 40), . . . ) v(num, val)
) gs;
You would be much better off with the original data at the student level. I assume you already know that.
Upvotes: 1