Reputation: 511
I have two mysql greatest-n-per-group, greatest-by-date problems:
Considering one students table and one grades table, I want to have all students displayed with their most recent grade.
The schema script:
CREATE TABLE student (
id int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO student VALUES(1, 'jim');
INSERT INTO student VALUES(2, 'mark ');
INSERT INTO student VALUES(3, 'john');
CREATE TABLE grades (
id int(11) NOT NULL AUTO_INCREMENT,
student_id int(11) NOT NULL,
grade int(11) NOT NULL,
`date` date DEFAULT NULL,
PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO grades VALUES(1, 1, 6, NULL);
INSERT INTO grades VALUES(2, 1, 8, NULL);
INSERT INTO grades VALUES(3, 1, 10, NULL);
INSERT INTO grades VALUES(4, 2, 9, '2016-05-10');
INSERT INTO grades VALUES(5, 2, 8, NULL);
INSERT INTO grades VALUES(6, 3, 6, '2016-05-26');
INSERT INTO grades VALUES(7, 3, 7, '2016-05-27');
A) I want to find out if this is a valid solution for getting the most recent record by a date field (date
) from a secondary table (grades
) grouped for each row in a main table (student
).
My query is:
SELECT s.id, s.name, g.grade, g.date
FROM student AS s
LEFT JOIN (
SELECT student_id, grade, DATE
FROM grades AS gr
WHERE DATE = (
SELECT MAX(DATE)
FROM grades
WHERE student_id = gr.student_id
)
GROUP BY student_id
) AS g ON s.id = g.student_id
Sql Fiddle: http://sqlfiddle.com/#!9/a84171/2
This query displays the desired (almost) results. But I have doubts that this is the best approach because it looks ugly, so I am very curious about the alternatives.
B) The second problem is the reason for the (almost)
above,
For the first row, name=Jim
it finds no grade though we have grades for Jim.
So just in case the query above would be valid only for NOT NULL date fields.
The question would be:
How to get the most recent grade for all the students who have grades, including Jim even that his grades has no date specified (NULL). In this case the most recent grouping will be given by the latest row inserted (MAX(id)) or just random.
Doesn't work with replacing date = (SELECT...
with date IN (SELECT ...
.
Any help would be much appreciated,
Thanks!
[UPDATE #1]:
For B) I found adding this to the sub-query, OR date IS NULL
, produces the desired result:
SELECT s.id, s.name, g.grade, g.date
FROM student AS s
LEFT JOIN (
SELECT id, student_id, grade, DATE
FROM grades AS gr
WHERE DATE = (
SELECT MAX(DATE)
FROM grades
WHERE student_id = gr.student_id
) OR date IS NULL
GROUP BY student_id
) AS g ON s.id = g.student_id
[UPDATE #2]
Seems the previous update worked if the first grade has a date for a student. It doesn't if the first grade is null. I would have linked a fiddle but it seems sqlfiddle doesn't work now.
So this is what I came up until now that seems to solve the B) problem:
SELECT s.id, s.name, g.grade, g.date
FROM student AS s
LEFT JOIN (
SELECT id, student_id, grade, DATE
FROM grades AS gr
WHERE (
`date` = (
SELECT MAX(DATE)
FROM grades
WHERE student_id = gr.student_id
)
) OR (
(
SELECT MAX(DATE)
FROM grades
WHERE student_id = gr.student_id
) IS NULL AND
date IS NULL
)
) AS g ON s.id = g.student_id
GROUP BY student_id
I still would like to know if you guys know better alternatives to this ugly thing.
Thanks!
[UPDATE #3]
@Strawberry The desired results would be:
id name grade date
1 jim 10 NULL
2 mark 9 2016-05-10
3 john 7 2016-05-27
Upvotes: 2
Views: 238
Reputation: 17289
http://sqlfiddle.com/#!9/ecec43/4
SELECT s.id, s.name, g.grade, g.date
FROM student AS s
LEFT JOIN (
SELECT gr.student_id, gr.grade, gr.DATE
FROM grades AS gr
LEFT JOIN grades grm
ON grm.student_id = gr.student_id
AND grm.date>gr.date
WHERE grm.student_id IS NULL
AND gr.date IS NOT NULL
GROUP BY gr.student_id
) AS g
ON s.id = g.student_id;
Upvotes: 0
Reputation: 33945
The complexity of this problem stems from the logical impossibility of a grade without an associated date, so obviously the solution is to fix that.
But here's a workaround...
E.g.:
SELECT a.*
FROM grades a
JOIN
( SELECT student_id
, MAX(COALESCE(UNIX_TIMESTAMP(date),id)) date
FROM grades
GROUP
BY student_id
) b
ON b.student_id = a.student_id
AND b.date = COALESCE(UNIX_TIMESTAMP(a.date),id);
Upvotes: 1