Reputation: 1208
I've been regressively testing my bulletin board I recently developed. I came across a small bug in my SQL query.
SELECT `user`.avatar, `user`.username, `user`.id AS user_id, `post`.title,
`post`.id, `post`.date AS post_date, `comment`.date AS last_comment_date
FROM `post`
INNER JOIN `user` ON `post`.user_id = `user`.id
LEFT JOIN `comment` ON `post`.id = `comment`.post_id
WHERE `post`.category_id = 1
ORDER BY IFNULL(`last_comment_date`, `post_date`) DESC;
Will produce something like;
I happen to know the cause of this. MySQL joins all the comments with their post, in other words, its a one to many relationship, a post can have many comments. This creates issues with pagination as the first ten posts could be the same depending on whether or not it has ten comments. I happen to only need the latest comment date from the comment
table.
I came across a couple of solutions.
Group by post.id
but it doesn't work as the order by executed after the rows have grouped.
Retrieve the entire result set and sort through what I need with PHP. Using this method I'd probably take a massive performance hit.
Doing multiple queries but this is not considered idea.
Doing a subquery but I don't know if this is a bad practice or not.
What is the best thing to do? Is there anyway to stop MySQL from returning duplicate data on a join if it is a one to many relationship?
EDIT 1
As @Zane asked, this SQL query...
SELECT `user`.avatar, `user`.username, `user`.id AS user_id, `post`.title, `post`.id,
`post`.date AS post_date, MAX(`comment`.date)
FROM `post`
INNER JOIN `user` ON `post`.user_id = `user`.id
LEFT JOIN `comment` ON `post`.id = `comment`.post_id
WHERE `post`.category_id = 1
GROUP BY `post`.id
ORDER BY IFNULL(MAX(`comment`.date), `post_date`) DESC
produces;
Which seems like exactly what I want. The posts are being sorted by the latest comment date if they have comments (or post date) if they don't. I can use this with pagination LIMITS
as well without any preprocessing. Do I still need IFNULL
? I'd think so.
EDiT 2
There is still a defect. If a post has a NULL last_comment_date it doesn't sort correctly. If you refer to the the image below. You can see that the last_comment_date is sitting there even though a bump should've occurred.
Please ignore what's below.
EDIT 3
This is what I want.
Upvotes: 0
Views: 177
Reputation: 3867
I think the GROUP BY is your best choice. Simply group by all selected columns except comment.date and use MAX-function to get the latest comment date like this:
SELECT
`user`.avatar,
`user`.username,
`user`.id AS user_id,
`post`.title,
`post`.id,
`post`.date AS post_date,
MAX(`comment`.date) AS last_comment_date
FROM
`post` INNER JOIN `user` ON `post`.user_id = `user`.id
LEFT JOIN `comment` ON `post`.id = `comment`.post_id
WHERE
`post`.category_id = 1
GROUP BY
`user`.avatar,
`user`.username,
`user`.id,
`post`.title,
`post`.id,
`post`.date
ORDER BY
IFNULL(`last_comment_date`, `post_date`) DESC;
Upvotes: 2