user1772115
user1772115

Reputation: 83

Selecting the minimum of two distinct values

I'm working on a sql selection where I need to get all the smallest records for an ID. My problem is that, while I thought my code was working, it would often return incorrect values for other columns, evidently because of my failure to use having to support my aggregate min function. Here is my code.

SELECT *
FROM example_students
INNER JOIN
(SELECT  id, student, test, subject, MIN(score), semester
FROM example_student_scores
WHERE studentid=94
GROUP BY studentid, test, subject) as scores
ON example_students.id=scores.studentid

This is what I want my code to do.

select from exampletable the minimum score for each distinct test and subject combination where the student has id#94

Here are two (heavily modified) tables for the example (I changed all the column names here and in my code too.

example_students
id    name
----+-----------+
94    Bob
1023  Thomas

.

example_students_scores
id    studentId     test       subject     score       semester
----+-----------+-----------+-----------+-----------+-----------

0    94          quiz        math        46          fall
1    94          quiz        math        71          fall
2    94          quiz        math        63          winter
3    94          midterm     math        94          winter
4    94          midterm     science     72          fall
5    94          quiz        math        50          spring
6    94          final       math        76          spring
7    1023        quiz        math        6           spring
8    1023        quiz        math        52          winter
9    1023        quiz        science     68          fall
..*

and the results should be

results
id    studentId     test       subject     score       semester
----+-----------+-----------+-----------+-----------+-----------
0    94          quiz        math        46          fall
3    94          midterm     math        94          winter
4    94          midterm     science     72          fall
6    94          final       math        76          spring

The problem is, I'll get the wrong values for the semester column (and all the other columns I have in my live work).

Given that this has taken me a long time to get no-where with, here is sql to create the two example databases:

drop table if exists example_students;
drop table if exists example_students_scores;
create table example_students(
    id int(10) primary key,
    name char(25)
);
create table example_students_scores(
    id int(10) not null,
    studentId int(10) not null,
    test char(20),
    subject char(20),
    score int(10) not null default '0',
    semester char(20),
    primary key (id),
    index studentid (studentid)
);
insert into example_students values ('94','Bob');
insert into example_students values ('1023','Thomas');
insert into example_students_scores values ('0'    ,'94'          ,'quiz'        ,'math'        ,'46'          ,'fall');
insert into example_students_scores values ('1'    ,'94'          ,'quiz'        ,'math'        ,'71'          ,'fall');
insert into example_students_scores values ('2'    ,'94'          ,'quiz'        ,'math'        ,'63'          ,'winter');
insert into example_students_scores values ('3'    ,'94'          ,'midterm'     ,'math'        ,'94'          ,'winter');
insert into example_students_scores values ('4'    ,'94'          ,'midterm'     ,'science'     ,'72'          ,'fall');
insert into example_students_scores values ('5'    ,'94'          ,'quiz'        ,'math'        ,'50'          ,'spring');
insert into example_students_scores values ('6'    ,'94'          ,'final'       ,'math'        ,'76'          ,'spring');
insert into example_students_scores values ('7'    ,'1023'        ,'quiz'        ,'math'        ,'6'           ,'spring');

I'd be grateful for any pointers or tips, it's very embarrassing to only figure out that your work is wrong a week after going live!

Upvotes: 8

Views: 1801

Answers (7)

wildplasser
wildplasser

Reputation: 44240

If two example_student_scores happen to have the same score, you'll need to avoid duplicates in the output. Min() would cause duplicates.

An excellent way to perform the task is to use window functions, and row_number().

If window functions are not available, the following way to simulate the "first-of-a-group" functionality could be:

SELECT * 
FROM example_students es
JOIN example_student_scores ss ON ss.studentid= es.id
WHERE es.id = 94
AND NOT EXISTS (
        SELECT *
        FROM example_student_scores nx
        WHERE nx.studentid = ss.studentid
        AND nx.test = ss.test
        AND nx.subject = ss.subject
        AND ( nx.score < ss.score
                -- tie breaker
                OR ( nx.score == ss.score AND nx.id < ss.id)
                )
        )
        ;

Upvotes: 1

Conrad Frix
Conrad Frix

Reputation: 52645

In MySQL when you include a field in the select and not in the group by you get an arbitrary value. In this case you can't simply include the semester and test ID in the group by because you won't get the results you want.

In order to accomplish this you must find the min score for each student, test, subject and then join it back on to the orginal table

SELECT * 
FROM   example_students_scores ess 
       INNER JOIN (SELECT studentid, 
                          test, 
                          subject, 
                          Min(score) score 
                   FROM   example_students_scores 
                   WHERE  studentid = 94 
                   GROUP  BY studentid, 
                             test, 
                             subject) scores 
               ON ess.studentid = scores.studentid 
                  AND ess.test = scores.test 
                  AND ess.subject = scores.subject 
                  AND ess.score = scores.score 

SQL Fiddle Demo

Another less conventional method is to self anti join on an inequality.

SELECT 
    s.*
FROM
    example_students_scores s
    LEFT JOIN example_students_scores s2
    ON s.studentID = s2.studentID
      AND s.test = s2.test
      AND s.subject = s2.subject
      AND s.score > s2.score
WHERE 
     s.studentid = 94 
     AND 
     s2.score is null

SQL Fiddle Demo

If you're interested in creating a tie breaker just add an or condition

SELECT 
    s.*
FROM
    example_students_scores s
    LEFT JOIN example_students_scores s2
    ON s.studentID = s2.studentID
      AND s.test = s2.test
      AND s.subject = s2.subject
      AND (s.score > s2.score
           or s.id > s2.id ) -- Added for tie breaker

WHERE 
     s.studentid = 94 
     AND 
     s2.score is null

SQL Fiddle Demo Note: I modified the data to include a tie situation

Upvotes: 4

moj
moj

Reputation: 454

with correlated subquery

select 
   * 
from 
   example_students_scores s1 
where 
   studentId = 94 and 
   score  = (select 
                 min(score)
             from 
                 example_students_scores s2 
             where
                 s2.studentId = s1.studentId and 
                 s2.subject = s1.subject and 
                 s2.test = s1.test 
             group by 
                 studentId, test, subject
            )

Demo with SQL Fiddle

Upvotes: 1

Robert
Robert

Reputation: 25753

Try this solution:

select es.name, e.studentid, e.test, e.subject, e.score as MinScore, e.semester 
from  example_students_scores e
join (
       select studentid, test, subject, min(score) as score
       from example_students_scores
        group by studentid, test, subject) e2 on  e.studentid=e2.studentid
                                              and e.test=e2.test 
                                              and e.subject=e2.subject
                                              and e.score=e2.score
join example_students es on e.studentid = es.id

SQL Fiddle Demo

Upvotes: 1

Yannick Chaze
Yannick Chaze

Reputation: 576

Just SELECT and GROUP BY the columns you need :

SELECT *
FROM example_students
INNER JOIN
(SELECT studentid, test, subject, MIN(score), semester
FROM example_student_scores
GROUP BY studentid, test, subject,semester) scores
ON example_students.id=scores.studentid
WHERE studentid=94

When you use an aggregation function on a column, all the other attributes in the SELECT clause must appear in the GROUP BY clause.

The rest of the query sounds good for me.

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269643

Does the following query do what you want?

  SELECT  test, subject, MIN(score) as minscore
  FROM example_student_scores
  WHERE studentid=94
  GROUP BY test, subject

Upvotes: 0

Taryn
Taryn

Reputation: 247690

This should work for you:

select ss2.id score_id,
  ss2.studentid,
  ss1.test,
  ss2.subject,
  ss1.score,
  ss2.semester
from example_students st
left join
(
  select min(score) score, test, subject, studentid
  from example_students_scores
  group by test, studentid, subject
) ss1
  on st.id = ss1.studentid
left join example_students_scores ss2
  on st.id = ss2.studentid
  and ss1.score = ss2.score
  and ss1.test = ss2.test
where st.id = 94
order by ss2.id

See SQL Fiddle with Demo

Upvotes: 2

Related Questions