Reputation: 10779
I have a table as following:
[ID] [Name] [Score] [Class]
1 John 90 A
2 Mary 63 A
3 Tom 87 A
4 David 98 A
5 Mary 87 B
6 David 77 B
7 David 73 C
8 Mary 92 C
9 Tom 73 C
10 John 79 C
11 Mary 70 D
12 Jane 85 D
13 David 83 D
I need to get the top 2 persons based on the scores in each class.
My expected output is
[ID] [Name] [Score] [Class]
1 John 90 A
4 David 98 A
5 Mary 87 B
6 David 77 B
8 Mary 92 C
10 John 79 C
12 Jane 85 D
13 David 83 D
Here is what I tried so far but this is not producing the correct results
SELECT *
FROM Student s
WHERE
(
SELECT COUNT(*)
FROM Student f
WHERE f.name = s.name AND
f.score >= s.score
) <= 2
Upvotes: 0
Views: 54
Reputation: 108370
The specification said... "for each class"
The smallest change that would need to be made to the proposed query to get the the specified results "students with the two highest scores for each class", we'd need to replace one of the predicates in the correlated subquery... matching on [class]
rather than on [name]
...
SELECT s.*
FROM Student s
WHERE
(
SELECT COUNT(*)
FROM Student f
WHERE f.class = s.class
AND f.score >= s.score
) <= 2
Note that if there are multiple students in the class with the same "highest scores", the query will return all of those students, not just two students for class.
If we specifically want to return "at most two students from each class, the students who have the highest score in each class", we'd need to write the query a little differently.
Given the example data, with no duplicate scores in a class, the results would be the same.
The difference can be demonstrated by adding a row to the example data, for example, adding Saul, having the same "second highest" score as David.
11 Mary 70 D
12 Jane 85 D
13 David 83 D
14 Saul 83 D
The question we need to ask about the specification... should only two students be returned for this class, Jane
has the highest score, so obviously return Jane
. But David
and Saul
have the same score. Do we return both, or if we only return one of them, does it matter which one we return?
Should we return three rows:
[ID] [Name] [Score] [Class]
12 Jane 85 D
13 David 83 D
14 Saul 83 D
because those are all of the students with the two highest scores, or should we return just two of the students with the highest scores:
[ID] [Name] [Score] [Class]
12 Jane 85 D
13 David 83 D
or
[ID] [Name] [Score] [Class]
12 Jane 85 D
14 Saul 83 D
Once that question is answered, we can write a query that returns the specified result.
And (obviously) this isn't the only query. There are other query patterns that will return an equivalent result... using either ANSI-standard syntax, or vendor specific extensions.
Upvotes: 1
Reputation: 31879
Use ROW_NUMBER
:
SELECT
ID, Name, Score, Class
FROM(
SELECT *,
rn = ROW_NUMBER() OVER(PARTITION BY Class ORDER BY Score DESC)
FROM Student
) t
WHERE rn <= 2
Upvotes: 1