erjiang

Reputation: 45727

Select nth percentile from MySQL

I have a simple table of data, and I'd like to select the row that's at about the 40th percentile from the query.

I can do this right now by first querying to find the number of rows and then running another query that sorts and selects the nth row:

select count(*) as `total` from mydata;

This may return something like 93, and 93 * 0.4 = 37.2, which I round down to 37:

select * from mydata order by `field` asc limit 37,1;

Can I combine these two queries into a single query?

Upvotes: 4

Views: 6486

Answers (3)

Chris

Reputation: 7288

There's also this solution, which uses a monster string made by GROUP_CONCAT. I had to raise the maximum output length like so to get it to work:

SET SESSION group_concat_max_len = 1000000;

MySQL wizards out there: feel free to comment on the relative performance of the methods.
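The solution referred to above is not reproduced here, so the following is only a sketch of the general GROUP_CONCAT technique, not necessarily the exact query meant. It assumes the `mydata` table and `field` column from the question, and it assumes the values of `field` never contain a comma. The alias `pct40` is made up for illustration.

```sql
-- Raise the cap so the concatenated string is not silently truncated.
SET SESSION group_concat_max_len = 1000000;

-- Build one sorted, comma-separated string of all values, keep the first
-- FLOOR(n * 0.4) + 1 items, then take the last item of that prefix.
SELECT SUBSTRING_INDEX(
         SUBSTRING_INDEX(
           GROUP_CONCAT(`field` ORDER BY `field` ASC SEPARATOR ','),
           ',',
           FLOOR(COUNT(*) * 0.4) + 1
         ),
         ',',
         -1
       ) AS pct40
FROM mydata;
```

With 93 rows this picks the 38th value in sorted order, which is the same row as `limit 37,1`. Note that it returns only the value of `field`, as a string, rather than the whole row.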

Upvotes: 0

mdma

Reputation: 57757

This will give you approximately the 40th percentile: it returns the row for which about 40% of the rows are less than it. It sorts rows by how far they are from the 40th percentile, since no row may fall exactly on it.

SELECT m1.field, m1.otherfield, count(m2.field) 
  FROM mydata m1 INNER JOIN mydata m2 ON m2.field<m1.field
GROUP BY 
   m1.field,m1.otherfield
ORDER BY 
   ABS(0.4-(count(m2.field)/(select count(*) from mydata)))
LIMIT 1
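To see how the ordering works, here is a small worked example. The table `mydata_demo` and its five rows are made up for illustration; the query is the one above with only the table name changed. (A regular table is used because MySQL cannot refer to a TEMPORARY table more than once in the same query.)

```sql
-- Hypothetical demo data, not from the question.
CREATE TABLE mydata_demo (field INT, otherfield VARCHAR(10));
INSERT INTO mydata_demo VALUES
  (10, 'a'), (20, 'b'), (30, 'c'), (40, 'd'), (50, 'e');

-- For each row, count the rows with a smaller value, turn that into a
-- fraction of the table, and order by the distance from 0.4.
--   20 -> 1/5 = 0.2 (distance 0.2)
--   30 -> 2/5 = 0.4 (distance 0.0)
--   40 -> 3/5 = 0.6 (distance 0.2)
--   50 -> 4/5 = 0.8 (distance 0.4)
SELECT m1.field, m1.otherfield, count(m2.field)
  FROM mydata_demo m1 INNER JOIN mydata_demo m2 ON m2.field < m1.field
GROUP BY
   m1.field, m1.otherfield
ORDER BY
   ABS(0.4 - (count(m2.field) / (select count(*) from mydata_demo)))
LIMIT 1;
-- returns: 30, 'c', 2

DROP TABLE mydata_demo;
```

The row with the smallest value (10) never appears, because the INNER JOIN finds no smaller row to pair it with. That only matters if you ask for a percentile very close to zero.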

Upvotes: 1

Wrikken

Reputation: 70500

As an exercise in futility (your current solution would probably be faster and preferred), if the table is MyISAM (or you can live with the approximate row count that InnoDB reports):

SET @row =0;
SELECT x.*
FROM information_schema.tables
JOIN (
  SELECT @row := @row+1 as 'row',mydata.*
  FROM mydata
  ORDER BY field ASC
) x
ON x.row = round(information_schema.tables.table_rows * 0.4)
WHERE information_schema.tables.table_schema = database()
AND information_schema.tables.table_name = 'mydata';
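For comparison, the two-step approach from the question can be kept entirely on the server with a user variable and a prepared statement. This is a sketch rather than a true single query, and the names `@offset` and `pct_stmt` are made up for illustration. It relies on the fact that LIMIT does not accept an arbitrary expression but does accept a `?` placeholder in a prepared statement.

```sql
-- Step 1: compute the zero-based offset of the 40th-percentile row.
-- CAST makes sure the variable holds an integer, which LIMIT requires.
SET @offset = (SELECT CAST(FLOOR(COUNT(*) * 0.4) AS UNSIGNED) FROM mydata);

-- Step 2: sort and fetch the single row at that offset.
PREPARE pct_stmt FROM
  'SELECT * FROM mydata ORDER BY `field` ASC LIMIT ?, 1';
EXECUTE pct_stmt USING @offset;
DEALLOCATE PREPARE pct_stmt;
```

Unlike the information_schema version, this uses an exact COUNT(*), so the result is the same on MyISAM and InnoDB.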

Upvotes: 0
