Reputation: 504
I've recently started to learn tsql beyond basic inserts and selects, I have test database that I train on, and there is one query that I can't really get to work.
There are 3 tables used in that query, in the picture there are simplified fields and relations
I have 2 following queries - first one is simply displaying students and number of marks from each subject. Second is doing almost what I want to achive - shows students and maxiumum amount of marks they got, so ex. subject1 - (marks) 1, 5, 3, 4 count - 4 subject2 - (marks) 5, 4, 5 - count - 3 Query shows 4 and from what I checked it returns correct results, but I want one more thing - just to show the name of the subject from which there is maximum amount of marks so in the example case - subject1
--Query 1--
SELECT s.Surname, subj.SubjectName, COUNT(m.Mark) as Marks_count
FROM marks m, students s, subjects subj
WHERE m.StudentId = s.StudentNumber and subj.SubjectNumber = m.SubjectId
GROUP BY s.Surname, subj.SubjectName
ORDER BY s.Surname
--Query 2--
SELECT query.Surname, MAX(Marks_count) as Maximum_marks_count FROM (SELECT s.Surname, subj.SubjectNumber, COUNT(m.Mark) as Marks_count
FROM marks m, students s, subjects subj
WHERE marks.StudentId = s.StudentNumber and subj.SubjectNumber = m.SubjectId
GROUP BY s.Surname, subj.SubjectName) as query
GROUP BY query.Surname
ORDER BY query.Surname
--Query 3 - not working as supposed--
SELECT query.Surname, query.SubjectName, MAX(Marks_count) as Maximum_marks_count FROM (SELECT s.Surname, subj.SubjectNumber, COUNT(m.Mark) as Marks_count
FROM marks m, students s, subjects subj
WHERE marks.StudentId = s.StudentNumber and subj.SubjectNumber = m.SubjectId
GROUP BY s.Surname, subj.SubjectName) as query
GROUP BY query.Surname, query.SubjectName
ORDER BY query.Surname
Part of the query 1 result
Part of the query 2 and unfortunately query 3 result
The problem is that when I add to the select statement subject name I got results as from query one - there is no more maximum amount of marks just students, subjects and amount of marks from each subject.
If someone could say what I'm missing, I will much appreciate :)
Upvotes: 0
Views: 131
Reputation: 74605
Here's a query that gets the highest mark per student, put it at the top of your sql file/batch and it will make another "table" you can join to your other tables to get the student name and the subject name:
WITH studentBest as
SELECT * FROM(
SELECT *, ROW_NUMBER() OVER(PARTITION BY studentid ORDER BY mark DESC) rown
FROM marks) a
WHERE rown = 1)
You use it like this (for example)
--the WITH bit goes above this line
SELECT *
FROM
studentBest sb
INNER JOIN
subject s
ON sb.subjectid = s.subjectnumber
Etc
That's also how you should be doing your joins
How does it work? Well.. it establishes an incrementing counter that restarts every time studentid changes (the partition clause) and the numberin goes in des ending mark order (the order by clause). An outer query selects only those rows with 1 in the row number, ie the top mark per student
Why can't I use group by?
You can, but you have to write a query that summarises the marks table into the top mark (max) per student and then you have to join that data back to the mark table to retrieve the subject and all in it's a lot more faff, often less efficient
What if there are two subjects with the same mark?
Use RANK instead of ROW_NUMBER if you want to see both
Edit in response to your comment:
An extension of the above method:
SELECT * FROM
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY su, st ORDER BY c DESC) rn FROM
(
SELECT studentid st, subjectid su, count(*) c
FROM marks
GROUP BY st, su
) a
) b
INNER JOIN student stu on b.st = stu.studentnumber
INNER JOIN subject sub on b.su = sub.subjectnumber
WHERE
b.rn = 1
We count the marks by student/subject, then rownumber them in descending order of count per student-subject pair, then choose only the first row and join in the other wanted data
Upvotes: 1
Reputation: 504
Ok thanks to Caius Jard, some other Stack's question and a little bit of experiments I managed to write working query, so this is how I did it.
First I created view from query1 and added one more column to it - studentId. Then I wrote query which almost satisfied me. That question helped me a lot with that task: Question
SELECT marks.Surname,
marks.SubjectName,
marks.Marks_count,
ROW_NUMBER() OVER(PARTITION BY marks.Surname ORDER BY marks.Surname) as RowNum
FROM MarksAmountPerStudentAndSubject marks
INNER JOIN (SELECT MarksAmountPerStudentAndSubject.Id,
MAX(MarksAmountPerStudentAndSubject.Marks_count) as MaxAmount
FROM MarksAmountPerStudentAndSubject
GROUP BY MarksAmountPerStudentAndSubject.Id) m
ON m.Id = marks.Id and marks.Marks_count = m.MaxAmount
It gives following results
That's what I wanted to achieve with one exception - if students have the same amount of marks from multiple subjects it displays all of them - thats fine but I decided to restrict this to the first result for each student - I couldn't just simply put TOP(1) there so I used similar solution that Caius Jard showed - ROW_NUMBER and window function - it gave me a chance to choose records that has row number equals to 1. I created another view from this query and I could simply write the final one
SELECT marks.Surname, marks.SubjectName, marks.Marks_count
FROM StudentsMaxMarksAmount marks
WHERE marks.RowNum = 1
ORDER BY marks.Surname
With result
Upvotes: 0