Reputation: 29
I have a large SQL table called 'allscores' similar to the following:
user score quiz_id high_score
Bob 90 math 1
John 80 math 0
John 85 math 1
Steve 100 math 1
Bob 95 reading 0
Bob 100 reading 1
John 80 reading 1
The 'high_score' field is in the table to begin with and is always set to '1' for the row where a user's score is the highest for them for that quiz.
What I want is a SQL query that I can run on an individual user to pull their highest score from each of the two quizzes ('math' and 'reading') along with their overall rank among scores for that quiz. What I have so far is the following:
SELECT `user`, `score`, `quiz_id` FROM `allscores` WHERE `user`="Bob" AND `high_score`="1"
Which will output the following:
user score quiz_id
Bob 90 math
Bob 100 reading
This query is simply pulling the highest score for Bob from each quiz - what I want to add is the ranking of the score among the scores in that particular quiz - so an output like the following:
user score quiz_id rank
Bob 90 math 2
Bob 100 reading 1
Bob's rank is '2' for the math quiz as Steve has a higher score, but he is ranked '1' for reading as he has the highest score.
How would I add this ranking column to my existing query?
Upvotes: 2
Views: 8483
Reputation: 4786
This uses MS T-SQL syntax, but if your flavor of SQL uses window functions, it should be similar.
MS SQL Server 2017 Schema Setup:
CREATE TABLE t (
[user] varchar(10)
, score int
, quiz_id varchar(10)
, high_score bit
) ;
INSERT INTO t ([user], score, quiz_id, high_score)
VALUES
( 'Bob',90,'math',1 )
, ( 'John',80,'math',0 )
, ( 'Steve',100,'math',1 )
, ( 'Bob',95,'reading',0 )
, ( 'Bob',100,'reading',1 )
, ( 'John',85,'math',1 )
, ( 'John',80,'reading',1 )
;
MAIN QUERY:
SELECT s1.[user]
, s1.score
, s1.quiz_id
, s1.high_score
--, s1.isUserHighScore
, s1.ranking
FROM (
SELECT
t.[user]
, t.score
, t.quiz_id
, t.high_score
--, ROW_NUMBER() OVER (PARTITION BY t.[user],t.quiz_id ORDER BY t.score DESC) AS isUserHighScore
, DENSE_RANK() OVER (PARTITION BY t.quiz_id ORDER BY t.score DESC ) AS ranking
FROM t
) s1
WHERE s1.[user]='Bob'
--AND s1.isUserHighScore = 1
AND s1.high_score = 1
| user | score | quiz_id | high_score | isUserHighScore | ranking |
|------|-------|---------|------------|-----------------|---------|
| Bob | 90 | math | true | 1 | 2 |
| Bob | 100 | reading | true | 1 | 1 |
I use ROW_NUMBER()
to determine the highest score in a quiz for a user, then use DENSE_RANK()
to figure out the rank of a users score versus others. The difference between DENSE_RANK()
and RANK()
is essentially that DENSE_RANK()
won't leave any gaps in the ranking. Example: 2 people scored a 90 and 1 scored an 80, then with DENSE_RANK()
the 90s would both be Rank 1 and the 80 would be Rank 2. With RANK()
, the 90s would be Rank 1 and the 80 would be Rank 3.
Upvotes: 2