aasthetic
aasthetic

Reputation: 335

SQL , Postgres: How to get percentile from non-normal distribution?

I have a bin of distribution of 100 marks , each bin has different distributions based on grading schema

Now I have a table with subject ID and array column of number of students who scored in the respective bins like this:

<26 <41 <61 <71 <81 <91 <101

4   15  11  14  18  19  28  

How to get the 25th, 50th, 75th and 95th percentile scores from this distribution in Postgres? The distribution may not be normal.

Is it possible to aggregate this in timescaledb (which is where I need to put this actually), since it doesn't support continuous aggregates as of yet?

Upvotes: 0

Views: 387

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270463

You can unpivot the data and extract information that way. Of course, you have insufficient information to really regenerate the original data. The following just uses one less than the bin max:

select generate_series(1, num, 1) as n, v.val
from t cross join lateral
     (values ("<26", 25), ("<41", 40), . . .  ) v(num, val);

Then you can run functions on this, such as:

select percentile_cont(0.5) within group (order by val)
from (select generate_series(1, num, 1) as n, v.val
      from t cross join lateral
           (values ("<26", 25), ("<41", 40), . . .  ) v(num, val)
     ) gs;

You would be much better off with the original data at the student level. I assume you already know that.

Upvotes: 1

Related Questions