Darin Hawley
Darin Hawley

Reputation: 29

Return Rank In SQL Query Based On Multiple Columns

I have a large SQL table called 'allscores' similar to the following:

user    score    quiz_id   high_score
Bob     90       math      1
John    80       math      0
John    85       math      1
Steve   100      math      1
Bob     95       reading   0
Bob     100      reading   1
John    80       reading   1

The 'high_score' field is in the table to begin with and is always set to '1' for the row where a user's score is the highest for them for that quiz.

What I want is a SQL query that I can run on an individual user to pull their highest score from each of the two quizzes ('math' and 'reading') along with their overall rank among scores for that quiz. What I have so far is the following:

SELECT `user`, `score`, `quiz_id` FROM `allscores` WHERE `user`="Bob" AND `high_score`="1"

Which will output the following:

user    score    quiz_id
Bob     90       math
Bob     100      reading

This query is simply pulling the highest score for Bob from each quiz - what I want to add is the ranking of the score among the scores in that particular quiz - so an output like the following:

user    score    quiz_id   rank
Bob     90       math      2
Bob     100      reading   1

Bob's rank is '2' for the math quiz as Steve has a higher score, but he is ranked '1' for reading as he has the highest score.

How would I add this ranking column to my existing query?

Upvotes: 2

Views: 8483

Answers (1)

Shawn
Shawn

Reputation: 4786

This uses MS T-SQL syntax, but if your flavor of SQL uses window functions, it should be similar.

SQL Fiddle

MS SQL Server 2017 Schema Setup:

CREATE TABLE t ( 
    [user] varchar(10)
  , score int
  , quiz_id varchar(10)
  , high_score bit
) ;

INSERT INTO t ([user], score, quiz_id, high_score)
VALUES
  ( 'Bob',90,'math',1 ) 
, ( 'John',80,'math',0 )
, ( 'Steve',100,'math',1 )
, ( 'Bob',95,'reading',0 )
, ( 'Bob',100,'reading',1 )
, ( 'John',85,'math',1 )
, ( 'John',80,'reading',1 )
;

MAIN QUERY:

SELECT s1.[user]
  , s1.score
  , s1.quiz_id
  , s1.high_score
  --, s1.isUserHighScore
  , s1.ranking
FROM (

  SELECT 
      t.[user]
    , t.score
    , t.quiz_id
    , t.high_score
    --, ROW_NUMBER() OVER (PARTITION BY t.[user],t.quiz_id ORDER BY t.score DESC) AS isUserHighScore
    , DENSE_RANK() OVER (PARTITION BY t.quiz_id ORDER BY t.score DESC ) AS ranking
  FROM t

) s1

WHERE s1.[user]='Bob'
  --AND s1.isUserHighScore = 1
    AND s1.high_score = 1

Results:

| user | score | quiz_id | high_score | isUserHighScore | ranking |
|------|-------|---------|------------|-----------------|---------|
|  Bob |    90 |    math |       true |               1 |       2 |
|  Bob |   100 | reading |       true |               1 |       1 |

I use ROW_NUMBER() to determine the highest score in a quiz for a user, then use DENSE_RANK() to figure out the rank of a users score versus others. The difference between DENSE_RANK() and RANK() is essentially that DENSE_RANK() won't leave any gaps in the ranking. Example: 2 people scored a 90 and 1 scored an 80, then with DENSE_RANK() the 90s would both be Rank 1 and the 80 would be Rank 2. With RANK(), the 90s would be Rank 1 and the 80 would be Rank 3.

Upvotes: 2

Related Questions