SLFl

Reputation: 313

SQL - Percentiles

I have one table:

country(ID, city, freg, counts, date) 

I want to calculate the 90th percentile of counts within a specific date interval (@min and @max).

I've already done the same thing for the average (code below):

SELECT
      AVG(counts)
FROM country
WHERE date>= @min AND date < @max
      ;

How can I calculate the 90th percentile instead of the average?

Upvotes: 0

Views: 986

Answers (2)

Strawberry

Reputation: 33945

Finally, something GROUP_CONCAT is good for...

SELECT SUBSTRING_INDEX(
          SUBSTRING_INDEX(
                GROUP_CONCAT(ct.ctdivol ORDER BY ct.ctdivol SEPARATOR ','),',',90/100 * COUNT(*) + 1
                         ),',',-1
                      ) `90th Percentile`
  FROM ct
  JOIN exam e
    ON e.examid = ct.examid
   AND e.date BETWEEN @min AND @max
 WHERE e.modality = 'ct';
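In case the nesting isn't obvious: the inner SUBSTRING_INDEX keeps the first k values of the sorted, comma-joined list, and the outer one (count -1) keeps the last of those, i.e. the k-th smallest value. Here's a rough Python stand-in for that logic. The substring_index helper and the sample values are made up for illustration, k is passed in explicitly (the query above computes it as 90/100 * COUNT(*) + 1), and how MySQL handles a fractional count is not modelled.

```python
def substring_index(s: str, delim: str, count: int) -> str:
    """Rough stand-in for MySQL's SUBSTRING_INDEX (hypothetical helper)."""
    parts = s.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter.
        return delim.join(parts[:count])
    if count < 0:
        # Everything to the right of the count-th delimiter from the end.
        return delim.join(parts[count:])
    return ""


# Made-up sample data: 10 values, sorted as GROUP_CONCAT(... ORDER BY ...) would.
values = sorted([3, 1, 4, 1, 5, 9, 2, 6, 5, 3])
joined = ",".join(str(v) for v in values)

k = 9  # position to pick, chosen explicitly for this sketch
first_k = substring_index(joined, ",", k)   # first 9 values, still comma-joined
kth = substring_index(first_k, ",", -1)     # last of those: the 9th smallest

print(kth)
```

Note the result comes back as a string, same as in the SQL version, so cast it if you need a number.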

Upvotes: 2

GigiSan

Reputation: 1282

MySQL has no built-in percentile aggregate function, so you can't just swap AVG for something else.

A simple way to get it is with multiple queries:

1) Select how many rows satisfy your condition.

SELECT
      COUNT(*)
FROM exam
INNER JOIN ct on exam.examID = ct.examID AND ct.ctdivol_mGy > 0
WHERE exam.modality = 'CT'
      AND exam.date >= @min AND exam.date < @max

2) Work out which row marks the percentile threshold by multiplying the number of rows by percentile/100. For example:

Number of rows in previous count: 200
Percentile: 90%
Number of rows to threshold: 200 * (90/100) = 180

3) Repeat the query, but select the value itself, order by the column you want the percentile of, and use LIMIT/OFFSET to return only the row number you found in step 2. Like so:

SELECT
      ct.ctdivol_mGy
FROM exam
INNER JOIN ct on exam.examID = ct.examID AND ct.ctdivol_mGy > 0
WHERE exam.modality = 'CT'
      AND exam.date >= @min AND exam.date < @max
ORDER BY ct.ctdivol_mGy
LIMIT 1 OFFSET 179 -- skip 179 rows, then take 1: the 180th row

You'll get the 180th value of the ordered rows, which is the 90th percentile you need. Hope this helps!
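If you're running the queries from application code, the three steps above could be wired together roughly like this. This is only a sketch under a few assumptions: it uses Python's sqlite3 with an in-memory database as a stand-in for MySQL, it queries the single country(counts, date) table from the question rather than the exam/ct join, the function name and sample data are made up, and it rounds the threshold row up when rows * percentile/100 is not a whole number (the steps above don't say which way to round).

```python
import sqlite3


def percentile_nearest_rank(conn, pct, date_min, date_max):
    """Return the pct-th percentile of counts in [date_min, date_max), or None."""
    where = "WHERE date >= ? AND date < ?"

    # Step 1: how many rows satisfy the condition.
    (n,) = conn.execute(
        f"SELECT COUNT(*) FROM country {where}", (date_min, date_max)
    ).fetchone()
    if n == 0:
        return None

    # Step 2: row number of the threshold, rounded up (assumption),
    # using integer arithmetic to avoid float rounding surprises.
    rank = max(1, (n * pct + 99) // 100)

    # Step 3: same filter, ordered, skipping rank - 1 rows and taking one.
    row = conn.execute(
        f"SELECT counts FROM country {where} ORDER BY counts LIMIT 1 OFFSET ?",
        (date_min, date_max, rank - 1),
    ).fetchone()
    return row[0]


# Made-up demo data: 200 rows inside the date range, 1 row outside it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (counts INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO country VALUES (?, ?)",
    [(c, "2020-01-15") for c in range(1, 201)],
)
conn.execute("INSERT INTO country VALUES (?, ?)", (9999, "2021-01-01"))

p90 = percentile_nearest_rank(conn, 90, "2020-01-01", "2020-02-01")
print(p90)
```

With 200 matching rows this picks the 180th ordered value, same as the worked example in step 2.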

Upvotes: 0
