Aadi
Aadi

Reputation: 7109

Rank function in MySQL

I need to find out rank of customers. Here I am adding the corresponding ANSI standard SQL query for my requirement. Please help me to convert it to MySQL .

SELECT RANK() OVER (PARTITION BY Gender ORDER BY Age) AS [Partition by Gender], 
  FirstName, 
  Age,
  Gender 
FROM Person

Is there any function to find out rank in MySQL?

Upvotes: 177

Views: 335453

Answers (14)

Kasthuri Shravankumar
Kasthuri Shravankumar

Reputation: 679

set @insert_rank = -2;

SELECT c_updated_time,
@insert_rank := IF(true = true, @insert_rank + 1, 1) 
from my_table limit 5;

This line will print the select above query;

select @v_query1;

show variables like '%read_only%';
SHOW GRANTS FOR 'USERNAME'@'localhost';

1660616f-f908-4d7e-ab60-c00b77adaf96

select count(*) from INFORMATION_SCHEMA.COLUMNS;
SHOW DATABASES WHERE `Database` IN ('information_schema', 'mysql', 'performance_schema');
use INFORMATION_SCHEMA;
show tables;
desc INFORMATION_SCHEMA.COLUMNS;
SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE COLUMN_NAME LIKE '%columnname%' and TABLE_SCHEMA like '%Database%' ORDER BY TABLE_NAME ,COLUMN_NAME;
SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'TableName' and TABLE_SCHEMA <> 'test' ORDER BY TABLE_NAME ,COLUMN_NAME;

For Date column

SELECT CURRENT_DATE() AS current_date;

For timestamp column

select now();

Upvotes: 0

Lukas Eder
Lukas Eder

Reputation: 220762

MySQL 5.7

In MySQL 5.7, you can use JSON and local variables as follows, to emulate RANK() OVER (PARTITION BY ..):

SELECT
  FirstName, Age, Gender,
  coalesce(
    json_extract(
      @rn := json_set(@rn, 
        @rnpath := concat('$."rn-', Gender, '"'), 
        @currn := coalesce(json_extract(@rn, @rnpath), 0) + 1,
        @prevpath := concat('$."pre-v-', Gender, '"'),
        Age,
        @prernpath := concat('$."pre-rn-', Gender, '"'),
        IF (json_extract(@rn, @prevpath) = Age, 
          coalesce(json_extract(@rn, @prernpath), @currn) div 1,
          @currn
        )
      ), 
      @prernpath
    ), 
    @currn
  ) AS rn2
FROM Person, (SELECT @rn := '{}') r
ORDER BY Age DESC;

Per partition key, this stores the ROW_NUMBER() ("rn-x"), previous value according to the sort key ("pre-v-x"), as well as the previous ROW_NUMBER() ("pre-rn-x") for tied rows, to emulate RANK() behaviour. The benefit of using JSON here is that the PARTITION BY clause does not influence the ordering of the query.

MySQL 8.0

Starting with MySQL 8, you can finally use window functions also in MySQL: https://dev.mysql.com/doc/refman/8.0/en/window-functions.html

Your query can be written exactly the same way:

SELECT RANK() OVER (PARTITION BY Gender ORDER BY Age) AS `Partition by Gender`, 
  FirstName, 
  Age,
  Gender 
FROM Person

Upvotes: 20

NINSIIMA WILBER
NINSIIMA WILBER

Reputation: 169

In a tweak to Daniel's answer, Some MySQL Versions throw errors at the IF statement

SELECT    first_name,
          age,
          gender,
          (IF(age=@_last_age,@curRank:=@curRank,@curRank:=@_sequence)) AS `rank`,
          @_sequence:=@_sequence+1,@_last_age:=age
FROM      person p, (SELECT @curRank := 1, @_sequence:=1, @_last_age:=0) r
ORDER BY  age;

So I had to wrap the IF condition testing in Ranking with braces

This anly calculates ranking but not Dense_rank

Upvotes: 0

Shallon Kobusinge
Shallon Kobusinge

Reputation: 565

SELECT FirstName,Age,Gender, RANK() OVER (partition by Gender order by Age desc) AS 'Partition by Gender' FROM Person
  • you can use asc/desc depending on which order you want if whether ascending or descending

Upvotes: 0

Aditya
Aditya

Reputation: 13

select id,first_name,gender,age,
rank() over(partition by gender order by age) rank_g
from person

CREATE TABLE person (id int, first_name varchar(20), age int, gender char(1));

INSERT INTO person VALUES (1, 'Bob', 25, 'M');
INSERT INTO person VALUES (2, 'Jane', 20, 'F');
INSERT INTO person VALUES (3, 'Jack', 30, 'M');
INSERT INTO person VALUES (4, 'Bill', 32, 'M');
INSERT INTO person VALUES (5, 'Nick', 22, 'M');
INSERT INTO person VALUES (6, 'Kathy', 18, 'F');
INSERT INTO person VALUES (7, 'Steve', 36, 'M');
INSERT INTO person VALUES (8, 'Anne', 25, 'F');
INSERT INTO person VALUES (9,'AKSH',32,'M');

Upvotes: 0

David Husnian
David Husnian

Reputation: 135

@Sam, your point is excellent in concept but I think you misunderstood what the MySQL docs are saying on the referenced page -- or I misunderstand :-) -- and I just wanted to add this so that if someone feels uncomfortable with the @Daniel's answer they'll be more reassured or at least dig a little deeper.

You see the "@curRank := @curRank + 1 AS rank" inside the SELECT is not "one statement", it's one "atomic" part of the statement so it should be safe.

The document you reference goes on to show examples where the same user-defined variable in 2 (atomic) parts of the statement, for example, "SELECT @curRank, @curRank := @curRank + 1 AS rank".

One might argue that @curRank is used twice in @Daniel's answer: (1) the "@curRank := @curRank + 1 AS rank" and (2) the "(SELECT @curRank := 0) r" but since the second usage is part of the FROM clause, I'm pretty sure it is guaranteed to be evaluated first; essentially making it a second, and preceding, statement.

In fact, on that same MySQL docs page you referenced, you'll see the same solution in the comments -- it could be where @Daniel got it from; yeah, I know that it's the comments but it is comments on the official docs page and that does carry some weight.

Upvotes: 6

Salman Arshad
Salman Arshad

Reputation: 272006

Here is a generic solution that assigns dense rank over partition to rows. It uses user variables:

CREATE TABLE person (
    id INT NOT NULL PRIMARY KEY,
    firstname VARCHAR(10),
    gender VARCHAR(1),
    age INT
);

INSERT INTO person (id, firstname, gender, age) VALUES
(1,  'Adams',  'M', 33),
(2,  'Matt',   'M', 31),
(3,  'Grace',  'F', 25),
(4,  'Harry',  'M', 20),
(5,  'Scott',  'M', 30),
(6,  'Sarah',  'F', 30),
(7,  'Tony',   'M', 30),
(8,  'Lucy',   'F', 27),
(9,  'Zoe',    'F', 30),
(10, 'Megan',  'F', 26),
(11, 'Emily',  'F', 20),
(12, 'Peter',  'M', 20),
(13, 'John',   'M', 21),
(14, 'Kate',   'F', 35),
(15, 'James',  'M', 32),
(16, 'Cole',   'M', 25),
(17, 'Dennis', 'M', 27),
(18, 'Smith',  'M', 35),
(19, 'Zack',   'M', 35),
(20, 'Jill',   'F', 25);

SELECT person.*, @rank := CASE
    WHEN @partval = gender AND @rankval = age THEN @rank
    WHEN @partval = gender AND (@rankval := age) IS NOT NULL THEN @rank + 1
    WHEN (@partval := gender) IS NOT NULL AND (@rankval := age) IS NOT NULL THEN 1
END AS rnk
FROM person, (SELECT @rank := NULL, @partval := NULL, @rankval := NULL) AS x
ORDER BY gender, age;

Notice that the variable assignments are placed inside the CASE expression. This (in theory) takes care of order of evaluation issue. The IS NOT NULL is added to handle datatype conversion and short circuiting issues.

PS: It can easily be converted to row number over partition by by removing all conditions that check for tie.

| id | firstname | gender | age | rank |
|----|-----------|--------|-----|------|
| 11 | Emily     | F      | 20  | 1    |
| 20 | Jill      | F      | 25  | 2    |
| 3  | Grace     | F      | 25  | 2    |
| 10 | Megan     | F      | 26  | 3    |
| 8  | Lucy      | F      | 27  | 4    |
| 6  | Sarah     | F      | 30  | 5    |
| 9  | Zoe       | F      | 30  | 5    |
| 14 | Kate      | F      | 35  | 6    |
| 4  | Harry     | M      | 20  | 1    |
| 12 | Peter     | M      | 20  | 1    |
| 13 | John      | M      | 21  | 2    |
| 16 | Cole      | M      | 25  | 3    |
| 17 | Dennis    | M      | 27  | 4    |
| 7  | Tony      | M      | 30  | 5    |
| 5  | Scott     | M      | 30  | 5    |
| 2  | Matt      | M      | 31  | 6    |
| 15 | James     | M      | 32  | 7    |
| 1  | Adams     | M      | 33  | 8    |
| 18 | Smith     | M      | 35  | 9    |
| 19 | Zack      | M      | 35  | 9    |

Demo on db<>fiddle

Upvotes: 60

Max
Max

Reputation: 571

To avoid the "however" in Erandac's answer in combination of Daniel's and Salman's answers, one may use one of the following "partition workarounds"

SELECT customerID, myDate

  -- partition ranking works only with CTE / from MySQL 8.0 on
  , RANK() OVER (PARTITION BY customerID ORDER BY dateFrom) AS rank, 

  -- Erandac's method in combination of Daniel's and Salman's
  -- count all items in sequence, maximum reaches row count.
  , IF(customerID=@_lastRank, @_curRank:=@_curRank, @_curRank:=@_sequence+1) AS sequenceRank
  , @_sequence:=@_sequence+1 as sequenceOverAll

  -- Dense partition ranking, works also with MySQL 5.7
  -- remember to set offset values in from clause
  , IF(customerID=@_lastRank, @_nxtRank:=@_nxtRank, @_nxtRank:=@_nxtRank+1 ) AS partitionRank
  , IF(customerID=@_lastRank, @_overPart:=@_overPart+1, @_overPart:=1 ) AS partitionSequence

  , @_lastRank:=customerID
FROM myCustomers, 
  (SELECT @_curRank:=0, @_sequence:=0, @_lastRank:=0, @_nxtRank:=0, @_overPart:=0 ) r
ORDER BY customerID, myDate

The partition ranking in the 3rd variant in this code snippet will return continous ranking numbers. this will lead to a data structur similar to the rank() over partition by result. As an example, see below. In particular, the partitionSequence will always start with 1 for each new partitionRank, using this method:

customerID    myDate   sequenceRank (Erandac)
                          |    sequenceOverAll
                          |     |   partitionRank
                          |     |     | partitionSequence
                          |     |     |    | lastRank
... lines ommitted for clarity
40    09.11.2016 11:19    1     44    1   44    40
40    09.12.2016 12:08    1     45    1   45    40
40    09.12.2016 12:08    1     46    1   46    40
40    09.12.2016 12:11    1     47    1   47    40
40    09.12.2016 12:12    1     48    1   48    40
40    13.10.2017 16:31    1     49    1   49    40
40    15.10.2017 11:00    1     50    1   50    40
76    01.07.2015 00:24    51    51    2    1    76
77    04.08.2014 13:35    52    52    3    1    77
79    15.04.2015 20:25    53    53    4    1    79
79    24.04.2018 11:44    53    54    4    2    79
79    08.10.2018 17:37    53    55    4    3    79
117   09.07.2014 18:21    56    56    5    1   117
119   26.06.2014 13:55    57    57    6    1   119
119   02.03.2015 10:23    57    58    6    2   119
119   12.10.2015 10:16    57    59    6    3   119
119   08.04.2016 09:32    57    60    6    4   119
119   05.10.2016 12:41    57    61    6    5   119
119   05.10.2016 12:42    57    62    6    6   119
...

Upvotes: 2

Salman Arshad
Salman Arshad

Reputation: 272006

The most straight forward solution to determine the rank of a given value is to count the number of values before it. Suppose we have the following values:

10 20 30 30 30 40
  • All 30 values are considered 3rd
  • All 40 values are considered 6th (rank) or 4th (dense rank)

Now back to the original question. Here is some sample data which is sorted as described in OP (expected ranks are added on the right):

+------+-----------+------+--------+    +------+------------+
| id   | firstname | age  | gender |    | rank | dense_rank |
+------+-----------+------+--------+    +------+------------+
|   11 | Emily     |   20 | F      |    |    1 |          1 |
|    3 | Grace     |   25 | F      |    |    2 |          2 |
|   20 | Jill      |   25 | F      |    |    2 |          2 |
|   10 | Megan     |   26 | F      |    |    4 |          3 |
|    8 | Lucy      |   27 | F      |    |    5 |          4 |
|    6 | Sarah     |   30 | F      |    |    6 |          5 |
|    9 | Zoe       |   30 | F      |    |    6 |          5 |
|   14 | Kate      |   35 | F      |    |    8 |          6 |
|    4 | Harry     |   20 | M      |    |    1 |          1 |
|   12 | Peter     |   20 | M      |    |    1 |          1 |
|   13 | John      |   21 | M      |    |    3 |          2 |
|   16 | Cole      |   25 | M      |    |    4 |          3 |
|   17 | Dennis    |   27 | M      |    |    5 |          4 |
|    5 | Scott     |   30 | M      |    |    6 |          5 |
|    7 | Tony      |   30 | M      |    |    6 |          5 |
|    2 | Matt      |   31 | M      |    |    8 |          6 |
|   15 | James     |   32 | M      |    |    9 |          7 |
|    1 | Adams     |   33 | M      |    |   10 |          8 |
|   18 | Smith     |   35 | M      |    |   11 |          9 |
|   19 | Zack      |   35 | M      |    |   11 |          9 |
+------+-----------+------+--------+    +------+------------+

To calculate RANK() OVER (PARTITION BY Gender ORDER BY Age) for Sarah, you can use this query:

SELECT COUNT(id) + 1 AS rank, COUNT(DISTINCT age) + 1 AS dense_rank
FROM testdata
WHERE gender = (SELECT gender FROM testdata WHERE id = 6)
AND age < (SELECT age FROM testdata WHERE id = 6)

+------+------------+
| rank | dense_rank |
+------+------------+
|    6 |          5 |
+------+------------+

To calculate RANK() OVER (PARTITION BY Gender ORDER BY Age) for All rows you can use this query:

SELECT testdata.id, COUNT(lesser.id) + 1 AS rank, COUNT(DISTINCT lesser.age) + 1 AS dense_rank
FROM testdata
LEFT JOIN testdata AS lesser ON lesser.age < testdata.age AND lesser.gender = testdata.gender
GROUP BY testdata.id

And here is the result (joined values are added on right):

+------+------+------------+    +-----------+-----+--------+
| id   | rank | dense_rank |    | firstname | age | gender |
+------+------+------------+    +-----------+-----+--------+
|   11 |    1 |          1 |    | Emily     |  20 | F      |
|    3 |    2 |          2 |    | Grace     |  25 | F      |
|   20 |    2 |          2 |    | Jill      |  25 | F      |
|   10 |    4 |          3 |    | Megan     |  26 | F      |
|    8 |    5 |          4 |    | Lucy      |  27 | F      |
|    6 |    6 |          5 |    | Sarah     |  30 | F      |
|    9 |    6 |          5 |    | Zoe       |  30 | F      |
|   14 |    8 |          6 |    | Kate      |  35 | F      |
|    4 |    1 |          1 |    | Harry     |  20 | M      |
|   12 |    1 |          1 |    | Peter     |  20 | M      |
|   13 |    3 |          2 |    | John      |  21 | M      |
|   16 |    4 |          3 |    | Cole      |  25 | M      |
|   17 |    5 |          4 |    | Dennis    |  27 | M      |
|    5 |    6 |          5 |    | Scott     |  30 | M      |
|    7 |    6 |          5 |    | Tony      |  30 | M      |
|    2 |    8 |          6 |    | Matt      |  31 | M      |
|   15 |    9 |          7 |    | James     |  32 | M      |
|    1 |   10 |          8 |    | Adams     |  33 | M      |
|   18 |   11 |          9 |    | Smith     |  35 | M      |
|   19 |   11 |          9 |    | Zack      |  35 | M      |
+------+------+------------+    +-----------+-----+--------+

Upvotes: 5

Rahul Agarwal
Rahul Agarwal

Reputation: 935

While the most upvoted answer ranks, it doesn't partition, You can do a self Join to get the whole thing partitioned also:

SELECT    a.first_name,
      a.age,
      a.gender,
        count(b.age)+1 as rank
FROM  person a left join person b on a.age>b.age and a.gender=b.gender 
group by  a.first_name,
      a.age,
      a.gender

Use Case

CREATE TABLE person (id int, first_name varchar(20), age int, gender char(1));

INSERT INTO person VALUES (1, 'Bob', 25, 'M');
INSERT INTO person VALUES (2, 'Jane', 20, 'F');
INSERT INTO person VALUES (3, 'Jack', 30, 'M');
INSERT INTO person VALUES (4, 'Bill', 32, 'M');
INSERT INTO person VALUES (5, 'Nick', 22, 'M');
INSERT INTO person VALUES (6, 'Kathy', 18, 'F');
INSERT INTO person VALUES (7, 'Steve', 36, 'M');
INSERT INTO person VALUES (8, 'Anne', 25, 'F');

Answer:

Bill    32  M   4
Bob     25  M   2
Jack    30  M   3
Nick    22  M   1
Steve   36  M   5
Anne    25  F   3
Jane    20  F   2
Kathy   18  F   1

Upvotes: 59

erandac
erandac

Reputation: 575

Combination of Daniel's and Salman's answer. However the rank will not give as continues sequence with ties exists . Instead it skips the rank to next. So maximum always reach row count.

    SELECT    first_name,
              age,
              gender,
              IF(age=@_last_age,@curRank:=@curRank,@curRank:=@_sequence) AS rank,
              @_sequence:=@_sequence+1,@_last_age:=age
    FROM      person p, (SELECT @curRank := 1, @_sequence:=1, @_last_age:=0) r
    ORDER BY  age;

Schema and Test Case:

CREATE TABLE person (id int, first_name varchar(20), age int, gender char(1));

INSERT INTO person VALUES (1, 'Bob', 25, 'M');
INSERT INTO person VALUES (2, 'Jane', 20, 'F');
INSERT INTO person VALUES (3, 'Jack', 30, 'M');
INSERT INTO person VALUES (4, 'Bill', 32, 'M');
INSERT INTO person VALUES (5, 'Nick', 22, 'M');
INSERT INTO person VALUES (6, 'Kathy', 18, 'F');
INSERT INTO person VALUES (7, 'Steve', 36, 'M');
INSERT INTO person VALUES (8, 'Anne', 25, 'F');
INSERT INTO person VALUES (9, 'Kamal', 25, 'M');
INSERT INTO person VALUES (10, 'Saman', 32, 'M');

Output:

+------------+------+--------+------+--------------------------+-----------------+
| first_name | age  | gender | rank | @_sequence:=@_sequence+1 | @_last_age:=age |
+------------+------+--------+------+--------------------------+-----------------+
| Kathy      |   18 | F      |    1 |                        2 |              18 |
| Jane       |   20 | F      |    2 |                        3 |              20 |
| Nick       |   22 | M      |    3 |                        4 |              22 |
| Kamal      |   25 | M      |    4 |                        5 |              25 |
| Anne       |   25 | F      |    4 |                        6 |              25 |
| Bob        |   25 | M      |    4 |                        7 |              25 |
| Jack       |   30 | M      |    7 |                        8 |              30 |
| Bill       |   32 | M      |    8 |                        9 |              32 |
| Saman      |   32 | M      |    8 |                       10 |              32 |
| Steve      |   36 | M      |   10 |                       11 |              36 |
+------------+------+--------+------+--------------------------+-----------------+

Upvotes: 20

Sam Kidman
Sam Kidman

Reputation: 31

If you want to rank just one person you can do the following:

SELECT COUNT(Age) + 1
 FROM PERSON
WHERE(Age < age_to_rank)

This ranking corresponds to the oracle RANK function (Where if you have people with the same age they get the same rank, and the ranking after that is non-consecutive).

It's a little bit faster than using one of the above solutions in a subquery and selecting from that to get the ranking of one person.

This can be used to rank everyone but it's slower than the above solutions.

SELECT
  Age AS age_var,
(
  SELECT COUNT(Age) + 1
  FROM Person
  WHERE (Age < age_var)
 ) AS rank
 FROM Person

Upvotes: 3

Mukesh Soni
Mukesh Soni

Reputation: 6668

A tweak of Daniel's version to calculate percentile along with rank. Also two people with same marks will get the same rank.

set @totalStudents = 0;
select count(*) into @totalStudents from marksheets;
SELECT id, score, @curRank := IF(@prevVal=score, @curRank, @studentNumber) AS rank, 
@percentile := IF(@prevVal=score, @percentile, (@totalStudents - @studentNumber + 1)/(@totalStudents)*100),
@studentNumber := @studentNumber + 1 as studentNumber, 
@prevVal:=score
FROM marksheets, (
SELECT @curRank :=0, @prevVal:=null, @studentNumber:=1, @percentile:=100
) r
ORDER BY score DESC

Results of the query for a sample data -

+----+-------+------+---------------+---------------+-----------------+
| id | score | rank | percentile    | studentNumber | @prevVal:=score |
+----+-------+------+---------------+---------------+-----------------+
| 10 |    98 |    1 | 100.000000000 |             2 |              98 |
|  5 |    95 |    2 |  90.000000000 |             3 |              95 |
|  6 |    91 |    3 |  80.000000000 |             4 |              91 |
|  2 |    91 |    3 |  80.000000000 |             5 |              91 |
|  8 |    90 |    5 |  60.000000000 |             6 |              90 |
|  1 |    90 |    5 |  60.000000000 |             7 |              90 |
|  9 |    84 |    7 |  40.000000000 |             8 |              84 |
|  3 |    83 |    8 |  30.000000000 |             9 |              83 |
|  4 |    72 |    9 |  20.000000000 |            10 |              72 |
|  7 |    60 |   10 |  10.000000000 |            11 |              60 |
+----+-------+------+---------------+---------------+-----------------+

Upvotes: 25

Daniel Vassallo
Daniel Vassallo

Reputation: 344281

One option is to use a ranking variable, such as the following:

SELECT    first_name,
          age,
          gender,
          @curRank := @curRank + 1 AS rank
FROM      person p, (SELECT @curRank := 0) r
ORDER BY  age;

The (SELECT @curRank := 0) part allows the variable initialization without requiring a separate SET command.

Test case:

CREATE TABLE person (id int, first_name varchar(20), age int, gender char(1));

INSERT INTO person VALUES (1, 'Bob', 25, 'M');
INSERT INTO person VALUES (2, 'Jane', 20, 'F');
INSERT INTO person VALUES (3, 'Jack', 30, 'M');
INSERT INTO person VALUES (4, 'Bill', 32, 'M');
INSERT INTO person VALUES (5, 'Nick', 22, 'M');
INSERT INTO person VALUES (6, 'Kathy', 18, 'F');
INSERT INTO person VALUES (7, 'Steve', 36, 'M');
INSERT INTO person VALUES (8, 'Anne', 25, 'F');

Result:

+------------+------+--------+------+
| first_name | age  | gender | rank |
+------------+------+--------+------+
| Kathy      |   18 | F      |    1 |
| Jane       |   20 | F      |    2 |
| Nick       |   22 | M      |    3 |
| Bob        |   25 | M      |    4 |
| Anne       |   25 | F      |    5 |
| Jack       |   30 | M      |    6 |
| Bill       |   32 | M      |    7 |
| Steve      |   36 | M      |    8 |
+------------+------+--------+------+
8 rows in set (0.02 sec)

Upvotes: 289

Related Questions