Reputation: 183439
The following is the simplest possible example, though any solution should be able to scale to however many n top results are needed:
Given a table like that below, with person, group, and age columns, how would you get the 2 oldest people in each group? (Ties within groups should not yield more results, but give the first 2 in alphabetical order)
+--------+-------+-----+ | Person | Group | Age | +--------+-------+-----+ | Bob | 1 | 32 | | Jill | 1 | 34 | | Shawn | 1 | 42 | | Jake | 2 | 29 | | Paul | 2 | 36 | | Laura | 2 | 39 | +--------+-------+-----+
Desired result set:
+--------+-------+-----+ | Shawn | 1 | 42 | | Jill | 1 | 34 | | Laura | 2 | 39 | | Paul | 2 | 36 | +--------+-------+-----+
NOTE: This question builds on a previous one- Get records with max value for each group of grouped SQL results - for getting a single top row from each group, and which received a great MySQL-specific answer from @Bohemian:
select *
from (select * from mytable order by `Group`, Age desc, Person) x
group by `Group`
Would love to be able to build off this, though I don't see how.
Upvotes: 192
Views: 238247
Reputation: 55
In SQL Server row_number()
is a powerful function that can get result easily as below
select Person,[group],age
from
(
select * ,row_number() over(partition by [group] order by age desc) rn
from mytable
) t
where rn <= 2
Upvotes: 2
Reputation: 621
WITH cte_window AS (
SELECT movie_name,director_id,release_date,
ROW_NUMBER() OVER( PARTITION BY director_id ORDER BY release_date DESC) r
FROM movies
)
SELECT * FROM cte_window WHERE r <= <n>;
Above query will returns latest n movies for each directors.
Upvotes: 10
Reputation: 1823
SELECT
p1.Person,
p1.`GROUP`,
p1.Age
FROM
person AS p1
WHERE
(
SELECT
COUNT( DISTINCT ( p2.age ) )
FROM
person AS p2
WHERE
p2.`GROUP` = p1.`GROUP`
AND p2.Age >= p1.Age
) < 2
ORDER BY
p1.`GROUP` ASC,
p1.age DESC
Upvotes: 0
Reputation: 5389
There is a really nice answer to this problem at MySQL - How To Get Top N Rows per Each Group
Based on the solution in the referenced link, your query would be like:
SELECT Person, Group, Age
FROM
(SELECT Person, Group, Age,
@group_rank := IF(@group = Group, @group_rank + 1, 1) AS group_rank,
@current_group := Group
FROM `your_table`
ORDER BY Group, Age DESC
) ranked
WHERE group_rank <= `n`
ORDER BY Group, Age DESC;
where n
is the top n
and your_table
is the name of your table.
I think the explanation in the reference is really clear. For quick reference I will copy and paste it here:
Currently MySQL does not support ROW_NUMBER() function that can assign a sequence number within a group, but as a workaround we can use MySQL session variables.
These variables do not require declaration, and can be used in a query to do calculations and to store intermediate results.
@current_country := country This code is executed for each row and stores the value of country column to @current_country variable.
@country_rank := IF(@current_country = country, @country_rank + 1, 1) In this code, if @current_country is the same we increment rank, otherwise set it to 1. For the first row @current_country is NULL, so rank is also set to 1.
For correct ranking, we need to have ORDER BY country, population DESC
Upvotes: 1
Reputation: 71
I wanted to share this because I spent a long time searching for an easy way to implement this in a java program I'm working on. This doesn't quite give the output you're looking for but its close. The function in mysql called GROUP_CONCAT()
worked really well for specifying how many results to return in each group. Using LIMIT
or any of the other fancy ways of trying to do this with COUNT
didn't work for me. So if you're willing to accept a modified output, its a great solution. Lets say I have a table called 'student' with student ids, their gender, and gpa. Lets say I want to top 5 gpas for each gender. Then I can write the query like this
SELECT sex, SUBSTRING_INDEX(GROUP_CONCAT(cast(gpa AS char ) ORDER BY gpa desc), ',',5)
AS subcategories FROM student GROUP BY sex;
Note that the parameter '5' tells it how many entries to concatenate into each row
And the output would look something like
+--------+----------------+
| Male | 4,4,4,4,3.9 |
| Female | 4,4,3.9,3.9,3.8|
+--------+----------------+
You can also change the ORDER BY
variable and order them a different way. So if I had the student's age I could replace the 'gpa desc' with 'age desc' and it will work! You can also add variables to the group by statement to get more columns in the output. So this is just a way I found that is pretty flexible and works good if you are ok with just listing results.
Upvotes: 2
Reputation: 487
Snuffin solution seems quite slow to execute when you've got plenty of rows and Mark Byers/Rick James and Bluefeet solutions doesn't work on my environnement (MySQL 5.6) because order by is applied after execution of select, so here is a variant of Marc Byers/Rick James solutions to fix this issue (with an extra imbricated select):
select person, groupname, age
from
(
select person, groupname, age,
(@rn:=if(@prev = groupname, @rn +1, 1)) as rownumb,
@prev:= groupname
from
(
select person, groupname, age
from persons
order by groupname , age desc, person
) as sortedlist
JOIN (select @prev:=NULL, @rn :=0) as vars
) as groupedlist
where rownumb<=2
order by groupname , age desc, person;
I tried similar query on a table having 5 millions rows and it returns result in less than 3 seconds
Upvotes: 19
Reputation: 247630
Here is one way to do this, using UNION ALL
(See SQL Fiddle with Demo). This works with two groups, if you have more than two groups, then you would need to specify the group
number and add queries for each group
:
(
select *
from mytable
where `group` = 1
order by age desc
LIMIT 2
)
UNION ALL
(
select *
from mytable
where `group` = 2
order by age desc
LIMIT 2
)
There are a variety of ways to do this, see this article to determine the best route for your situation:
http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/
Edit:
This might work for you too, it generates a row number for each record. Using an example from the link above this will return only those records with a row number of less than or equal to 2:
select person, `group`, age
from
(
select person, `group`, age,
(@num:=if(@group = `group`, @num +1, if(@group := `group`, 1, 1))) row_number
from test t
CROSS JOIN (select @num:=0, @group:=null) c
order by `Group`, Age desc, person
) as x
where x.row_number <= 2;
See Demo
Upvotes: 106
Reputation: 142208
If the other answers are not fast enough Give this code a try:
SELECT
province, n, city, population
FROM
( SELECT @prev := '', @n := 0 ) init
JOIN
( SELECT @n := if(province != @prev, 1, @n + 1) AS n,
@prev := province,
province, city, population
FROM Canada
ORDER BY
province ASC,
population DESC
) x
WHERE n <= 3
ORDER BY province, n;
Output:
+---------------------------+------+------------------+------------+
| province | n | city | population |
+---------------------------+------+------------------+------------+
| Alberta | 1 | Calgary | 968475 |
| Alberta | 2 | Edmonton | 822319 |
| Alberta | 3 | Red Deer | 73595 |
| British Columbia | 1 | Vancouver | 1837970 |
| British Columbia | 2 | Victoria | 289625 |
| British Columbia | 3 | Abbotsford | 151685 |
| Manitoba | 1 | ...
Upvotes: 12
Reputation: 837946
In other databases you can do this using ROW_NUMBER
. MySQL doesn't support ROW_NUMBER
but you can use variables to emulate it:
SELECT
person,
groupname,
age
FROM
(
SELECT
person,
groupname,
age,
@rn := IF(@prev = groupname, @rn + 1, 1) AS rn,
@prev := groupname
FROM mytable
JOIN (SELECT @prev := NULL, @rn := 0) AS vars
ORDER BY groupname, age DESC, person
) AS T1
WHERE rn <= 2
See it working online: sqlfiddle
Edit I just noticed that bluefeet posted a very similar answer: +1 to him. However this answer has two small advantages:
So I'll leave it here in case it can help someone.
Upvotes: 83
Reputation:
How about using self-joining:
CREATE TABLE mytable (person, groupname, age);
INSERT INTO mytable VALUES('Bob',1,32);
INSERT INTO mytable VALUES('Jill',1,34);
INSERT INTO mytable VALUES('Shawn',1,42);
INSERT INTO mytable VALUES('Jake',2,29);
INSERT INTO mytable VALUES('Paul',2,36);
INSERT INTO mytable VALUES('Laura',2,39);
SELECT a.* FROM mytable AS a
LEFT JOIN mytable AS a2
ON a.groupname = a2.groupname AND a.age <= a2.age
GROUP BY a.person
HAVING COUNT(*) <= 2
ORDER BY a.groupname, a.age DESC;
gives me:
a.person a.groupname a.age
---------- ----------- ----------
Shawn 1 42
Jill 1 34
Laura 2 39
Paul 2 36
I was strongly inspired by the answer from Bill Karwin to Select top 10 records for each category
Also, I'm using SQLite, but this should work on MySQL.
Another thing: in the above, I replaced the group
column with a groupname
column for convenience.
Edit:
Following-up on the OP's comment regarding missing tie results, I incremented on snuffin's answer to show all the ties. This means that if the last ones are ties, more than 2 rows can be returned, as shown below:
.headers on
.mode column
CREATE TABLE foo (person, groupname, age);
INSERT INTO foo VALUES('Paul',2,36);
INSERT INTO foo VALUES('Laura',2,39);
INSERT INTO foo VALUES('Joe',2,36);
INSERT INTO foo VALUES('Bob',1,32);
INSERT INTO foo VALUES('Jill',1,34);
INSERT INTO foo VALUES('Shawn',1,42);
INSERT INTO foo VALUES('Jake',2,29);
INSERT INTO foo VALUES('James',2,15);
INSERT INTO foo VALUES('Fred',1,12);
INSERT INTO foo VALUES('Chuck',3,112);
SELECT a.person, a.groupname, a.age
FROM foo AS a
WHERE a.age >= (SELECT MIN(b.age)
FROM foo AS b
WHERE (SELECT COUNT(*)
FROM foo AS c
WHERE c.groupname = b.groupname AND c.age >= b.age) <= 2
GROUP BY b.groupname)
ORDER BY a.groupname ASC, a.age DESC;
gives me:
person groupname age
---------- ---------- ----------
Shawn 1 42
Jill 1 34
Laura 2 39
Paul 2 36
Joe 2 36
Chuck 3 112
Upvotes: 35
Reputation: 2122
Try this:
SELECT a.person, a.group, a.age FROM person AS a WHERE
(SELECT COUNT(*) FROM person AS b
WHERE b.group = a.group AND b.age >= a.age) <= 2
ORDER BY a.group ASC, a.age DESC
Upvotes: 58
Reputation: 14469
Check this out:
SELECT
p.Person,
p.`Group`,
p.Age
FROM
people p
INNER JOIN
(
SELECT MAX(Age) AS Age, `Group` FROM people GROUP BY `Group`
UNION
SELECT MAX(p3.Age) AS Age, p3.`Group` FROM people p3 INNER JOIN (SELECT MAX(Age) AS Age, `Group` FROM people GROUP BY `Group`) p4 ON p3.Age < p4.Age AND p3.`Group` = p4.`Group` GROUP BY `Group`
) p2 ON p.Age = p2.Age AND p.`Group` = p2.`Group`
ORDER BY
`Group`,
Age DESC,
Person;
SQL Fiddle: http://sqlfiddle.com/#!2/cdbb6/15
Upvotes: 10