Serbu Florin-Adrian
Serbu Florin-Adrian

Reputation: 511

mysql max per group by date

I have two mysql greatest-n-per-group, greatest-by-date problems:

Considering one students table and one grades table, I want to have all students displayed with their most recent grade.

The schema script:

    CREATE TABLE student (
      id int(11) NOT NULL AUTO_INCREMENT,
      `name` varchar(255) NOT NULL,
      PRIMARY KEY (id)
    ) ENGINE=InnoDB  DEFAULT CHARSET=latin1;

    INSERT INTO student VALUES(1, 'jim');
    INSERT INTO student VALUES(2, 'mark ');
    INSERT INTO student VALUES(3, 'john');

    CREATE TABLE grades (
      id int(11) NOT NULL AUTO_INCREMENT,
      student_id int(11) NOT NULL,
      grade int(11) NOT NULL,
      `date` date DEFAULT NULL,
      PRIMARY KEY (id)
    ) ENGINE=InnoDB  DEFAULT CHARSET=latin1;


    INSERT INTO grades VALUES(1, 1, 6, NULL);
    INSERT INTO grades VALUES(2, 1, 8, NULL);
    INSERT INTO grades VALUES(3, 1, 10, NULL);
    INSERT INTO grades VALUES(4, 2, 9, '2016-05-10');
    INSERT INTO grades VALUES(5, 2, 8, NULL);
    INSERT INTO grades VALUES(6, 3, 6, '2016-05-26');
    INSERT INTO grades VALUES(7, 3, 7, '2016-05-27');

A) I want to find out if this is a valid solution for getting the most recent record by a date field (date) from a secondary table (grades) grouped for each row in a main table (student).

My query is:

    SELECT s.id, s.name, g.grade, g.date
    FROM student AS s
    LEFT JOIN (
        SELECT student_id, grade, DATE
        FROM grades AS gr
        WHERE DATE = (
            SELECT MAX(DATE)
            FROM grades
            WHERE student_id = gr.student_id
        )
        GROUP BY student_id
    ) AS g ON s.id = g.student_id

Sql Fiddle: http://sqlfiddle.com/#!9/a84171/2

This query displays the desired (almost) results. But I have doubts that this is the best approach because it looks ugly, so I am very curious about the alternatives.

B) The second problem is the reason for the (almost) above,

For the first row, name=Jim it finds no grade though we have grades for Jim. So just in case the query above would be valid only for NOT NULL date fields. The question would be:

How to get the most recent grade for all the students who have grades, including Jim even that his grades has no date specified (NULL). In this case the most recent grouping will be given by the latest row inserted (MAX(id)) or just random.

Doesn't work with replacing date = (SELECT... with date IN (SELECT ....

Any help would be much appreciated,

Thanks!

[UPDATE #1]:

For B) I found adding this to the sub-query, OR date IS NULL, produces the desired result:

    SELECT s.id, s.name, g.grade, g.date
    FROM student AS s
    LEFT JOIN (
        SELECT id, student_id, grade, DATE
        FROM grades AS gr
        WHERE DATE = (
            SELECT MAX(DATE)
            FROM grades
            WHERE student_id = gr.student_id
        ) OR date IS NULL
        GROUP BY student_id
    ) AS g ON s.id = g.student_id

[UPDATE #2]

Seems the previous update worked if the first grade has a date for a student. It doesn't if the first grade is null. I would have linked a fiddle but it seems sqlfiddle doesn't work now.

So this is what I came up until now that seems to solve the B) problem:

    SELECT s.id, s.name, g.grade, g.date
    FROM student AS s
    LEFT JOIN (
        SELECT id, student_id, grade, DATE
        FROM grades AS gr
        WHERE (
            `date` = (
                SELECT MAX(DATE)
                FROM grades
                WHERE student_id = gr.student_id
            )
        ) OR (
            (
                SELECT MAX(DATE)
                FROM grades
                WHERE student_id = gr.student_id
            ) IS NULL AND 
            date IS NULL
        )   

    ) AS g ON s.id = g.student_id
    GROUP BY student_id

I still would like to know if you guys know better alternatives to this ugly thing.

Thanks!

[UPDATE #3]

@Strawberry The desired results would be:

    id  name    grade   date
    1   jim     10  NULL
    2   mark    9   2016-05-10
    3   john    7   2016-05-27

Upvotes: 2

Views: 238

Answers (2)

Alex
Alex

Reputation: 17289

http://sqlfiddle.com/#!9/ecec43/4

SELECT s.id, s.name, g.grade, g.date
FROM student AS s
LEFT JOIN (
  SELECT gr.student_id, gr.grade, gr.DATE
  FROM grades AS gr
  LEFT JOIN grades grm
  ON grm.student_id = gr.student_id 
    AND grm.date>gr.date
  WHERE grm.student_id IS NULL
    AND gr.date IS NOT NULL
  GROUP BY gr.student_id
) AS g 
ON s.id = g.student_id;

Upvotes: 0

Strawberry
Strawberry

Reputation: 33945

The complexity of this problem stems from the logical impossibility of a grade without an associated date, so obviously the solution is to fix that.

But here's a workaround...

E.g.:

SELECT a.* 
  FROM grades a 
  JOIN 
     ( SELECT student_id
            , MAX(COALESCE(UNIX_TIMESTAMP(date),id)) date 
         FROM grades 
        GROUP 
           BY student_id
     ) b 
    ON b.student_id = a.student_id 
   AND b.date = COALESCE(UNIX_TIMESTAMP(a.date),id);

Upvotes: 1

Related Questions