Tony Harrison
Tony Harrison

Reputation: 19

SQL query to apply multiple conditions against grouped data

Given the tables:

Exam (ExamId, SubjectId)
Result (ExamId, StudentId, GradeId)

What's the best way of retrieving a list of Students who received GradeId='A' in SubjectId='Maths' AND GradeId = 'B' in SubjectId='English' for their most recent exams in each subject? We can either assume ExamIds increase over time, or add an ExamDate column to Exam.

Thanks!

Upvotes: 1

Views: 17505

Answers (2)

MatBailie
MatBailie

Reputation: 86706

First thing you need to do is deal with the "most recent exam" condition.

The following gets the most recent exam for each student by subject...

SELECT
  [Result].StudentID,
  [Exam].SubjectID,
  MAX([Exam].id) AS ExamID
FROM
  Result
     INNER JOIN
  Exam
     ON [Exam].id = [Result].ExamID
GROUP BY
  [Result].StudentID,
  [Exam].SubjectID


You then need to get the grade for each exam, and apply your restrictions...

SELECT
  [Recent].StudentID
FROM
(
  SELECT
    [Result].StudentID,
    [Exam].SubjectID,
    MAX([Exam].id) AS ExamID
  FROM
    Result
       INNER JOIN
    Exam
       ON [Exam].id = [Result].ExamID
  GROUP BY
    [Result].StudentID,
    [Exam].SubjectID
)
  AS [Recent]
INNER JOIN
  Result
    ON  [Result].StudentID = [Recent].StudentID
    AND [Result].ExamID    = [Recent].ExamID
GROUP BY
  [Recent].StudentID
HAVING
  MIN(      
    CASE [Exam].SubjectID
       WHEN 'Maths'   THEN CASE WHEN GradeID = 'A' THEN 1 ELSE 0 END
       WHEN 'English' THEN CASE WHEN GradeID = 'B' THEN 1 ELSE 0 END
       ELSE 1
    END
  )
  = 1

If you know that you only want MATHS and ENGLISH this can be sped up by putting a WHERE clause in the sub query for [Recent]...

  WHERE
    [Exam].Subject IN ('Maths', 'English')

EDIT:

The "MIN(CASE) = 1" part of the query works as follows...
- If the subject is Maths, and they get an A, then 1. Else 0.
- If the subject is English, and they get a B, then 1. Else 0.
- If the subject is anything else, then 1.

If any of these returns 0, ignore that student.

Thus, if a student has Maths:A, English:B, Geography:A, the Geography WON'T cause a 0 and so won't cause the student to be ignored, no matter what their geography grade.

Thinking about it more now though, if they don't have a grade in Maths and/or English, they could still pass this check. If you want to exclude students who have not taken the Maths and/or English tests, use this HAVING clause instead...

  SUM(      
    CASE [Exam].SubjectID
       WHEN 'Maths'   THEN CASE WHEN GradeID = 'A' THEN 1 ELSE 0 END
       WHEN 'English' THEN CASE WHEN GradeID = 'B' THEN 1 ELSE 0 END
       ELSE 0
    END
  )
  = 2

The ELSE 0 ensures other subjects are ignored, and the SUM() = 2 ensures both conditions are matched.

EDIT:

To put the requirements in a table (and speed everything up)...

DECLARE @requirements TABLE (
  SubjectID    NVARCHAR(32),
  GradeID      NCHAR(1)
  )

INSERT INTO @requirements VALUES (N'Maths',   N'A')
INSERT INTO @requirements VALUES (N'English', N'B')

SELECT
  [Recent].StudentID
FROM
(
  SELECT
    [Result].StudentID     AS [StudentID],
    [Exam].SubjectID       AS [SubjectID],
    MAX([Exam].id)         AS [ExamID],
    [Requirements].GradeID AS [RequiredGrade]
  FROM
    Exam
  INNER JOIN
    @requirements [Requirements]
      ON [Requirements].SubjectID = [Exam].SubjectID
  INNER JOIN   
    Result
      ON [Exam].id = [Result].ExamID
  GROUP BY
    [Result].StudentID,
    [Exam].SubjectID,
    [Requirements].GradeID AS RequiredGrade
)
  AS [StudentExam]
INNER JOIN
  Result
    ON  [Result].StudentID = [StudentExam].StudentID
    AND [Result].ExamID    = [StudentExam].ExamID
    AND [Result].GradeID   = [StudentExam].RequiredGrade
GROUP BY
  [Recent].StudentID
HAVING
  COUNT(*) = (SELECT COUNT(*) FROM @requirements)

As mentioned in another post, if you can get an ExamDate in there, that would be more reliable than the ExamID column. It should also be said, provided you have enough control over the database, you should be able to prevent the Identity value doing anything other than going forwards.

Upvotes: 3

Tom H
Tom H

Reputation: 47444

If you get nothing else from this post then let it be this: do NOT assume that the IDs will always increase. That only leads to the Dark Side. Put in an Exam Date (I don't know why you wouldn't have one already, unless your professor just forgot to put it in the homework question) and use that to determine whether or not an exam is before or after another exam.

That said, here are two possible approaches. Neither is tested, so you should test them out and also make sure that you fully understand the solutions. They might be on the exam after all. :)

SELECT
     ENG.StudentID
FROM
     Results ENG_RES
INNER JOIN Exams ENG_EX ON
     ENG_EX.ExamID = ENG_RES.ExamID AND
     ENG_EX.SubjectID = 'English'
INNER JOIN Results MATH_RES ON
     MATH_RES.StudentID = ENG_RES.StudentID AND
     MATH_RES.GradeID = 'A'
INNER JOIN Exams MATH_EX
     MATH_EX.ExamID = MATH_RES.ExamID AND
     MATH_EX.SubjectID = 'Math'
LEFT OUTER JOIN Results MATH_RES2 ON
     MATH_RES2.StudentID = ENG_RES.StudentID
LEFT OUTER JOIN Exams MATH_EX2
     MATH_EX2.ExamID = MATH_RES2.ExamID AND
     MATH_EX2.SubjectID = 'Math' AND
     MATH_EX2.ExamDate > MATH_EX.ExamDate
LEFT OUTER JOIN Results ENG_RES2 ON
     ENG_RES2.StudentID = ENG_RES.StudentID
LEFT OUTER JOIN Exams ENG_EX2
     ENG_EX2.ExamID = ENG_RES2.ExamID AND
     ENG_EX2.SubjectID = 'English' AND
     ENG_EX2.ExamDate > ENG_EX.ExamDate
WHERE
     ENG_RES.GradeID = 'B' AND
     MATH_EX2.ExamID IS NULL AND
     ENG_EX2.ExamID IS NULL

Or alternatively:

SELECT
     ENG.StudentID
FROM
     Results ENG_RES
INNER JOIN Exams ENG_EX ON
     ENG_EX.ExamID = ENG_RES.ExamID AND
     ENG_EX.SubjectID = 'English'
INNER JOIN Results MATH_RES ON
     MATH_RES.StudentID = ENG_RES.StudentID AND
     MATH_RES.GradeID = 'A'
INNER JOIN Exams MATH_EX
     MATH_EX.ExamID = MATH_RES.ExamID AND
     MATH_EX.SubjectID = 'Math'
WHERE
     ENG_RES.GradeID = 'B' AND
     NOT EXISTS
          (
               SELECT *
               FROM
                    Results SQR1
               INNER JOIN Exams SQE1 ON
                    SQE1.ExamID = SQR1.ExamID AND
                    SQE1.SubjectID = 'Math' AND
                    SQE1.ExamDate > MATH_EX.ExamDate
               WHERE
                    SQR1.StudentID = ENG_RES.StudentID
          ) AND
     NOT EXISTS
          (
               SELECT *
               FROM
                    Results SQR2
               INNER JOIN Exams SQE2 ON
                    SQE2.ExamID = SQR2.ExamID AND
                    SQE2.SubjectID = 'English' AND
                    SQE2.ExamDate > ENG_EX.ExamDate
               WHERE
                    SQR2.StudentID = ENG_RES.StudentID
          )

Upvotes: 0

Related Questions