David542

Reputation: 110143

GROUP BY a specific order?

I am trying to write a GROUP BY query in which each group shows the row with the newest timestamp. However, I don't think it's possible to apply an ORDER BY before a GROUP BY. Is the following subselect the only way to accomplish this?

SELECT thread_id, content, timestamp FROM
(
  SELECT thread_id, content, timestamp FROM messaging_message 
  ORDER BY thread_id, timestamp desc
) combined 
GROUP BY thread_id    

Note that a given thread_id may have multiple messages associated with it, and therefore multiple content and timestamp values.

Upvotes: 1

Views: 617

Answers (1)

Michael Berkowski

Reputation: 270607

If I understand correctly, you want the most recent content per thread_id. Use a MAX() aggregate to find the newest timestamp for each thread, then JOIN the table against that result:

SELECT m.thread_id, m.content, m.timestamp
FROM 
  messaging_message m
  JOIN (
     SELECT thread_id, MAX(timestamp) AS maxts
     FROM messaging_message 
     GROUP BY thread_id
  ) maxt ON m.thread_id = maxt.thread_id AND m.timestamp = maxt.maxts

ORDER BY doesn't come into play at all with this method; the grouping does all the work.
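
As a rough check of the join approach, here is a minimal sketch that runs it from Python against an in-memory SQLite database. SQLite stands in for MySQL only so the example is self-contained; the sample rows and the `latest_per_thread` helper are made up for illustration.

```python
import sqlite3

# Hypothetical sample data: (thread_id, content, timestamp).
ROWS = [
    (1, "first in thread 1", "2012-01-01 10:00:00"),
    (1, "latest in thread 1", "2012-01-03 09:30:00"),
    (1, "middle of thread 1", "2012-01-02 12:00:00"),
    (2, "latest in thread 2", "2012-01-05 08:00:00"),
    (2, "first in thread 2", "2012-01-04 08:00:00"),
]

# Join each row against its thread's newest timestamp.
# The final ORDER BY only makes the output order predictable.
QUERY = """
SELECT m.thread_id, m.content, m.timestamp
FROM messaging_message m
JOIN (
    SELECT thread_id, MAX(timestamp) AS maxts
    FROM messaging_message
    GROUP BY thread_id
) maxt ON m.thread_id = maxt.thread_id AND m.timestamp = maxt.maxts
ORDER BY m.thread_id
"""


def latest_per_thread(rows):
    """Load rows into a throwaway table and return the newest row per thread."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE messaging_message "
            "(thread_id INTEGER, content TEXT, timestamp TEXT)"
        )
        conn.executemany("INSERT INTO messaging_message VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in latest_per_thread(ROWS):
        print(row)
```

Note that if two messages in the same thread share the newest timestamp, this join returns both of them.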

MySQL, unlike most other RDBMSs, doesn't strictly require every SELECT column to be accounted for in the GROUP BY (provided the ONLY_FULL_GROUP_BY SQL mode is off), so you could probably just do

SELECT thread_id, content, MAX(timestamp) AS maxts FROM messaging_message GROUP BY thread_id

However, that isn't portable, and MySQL may take content from any row in the group, not necessarily the one holding the MAX(timestamp), so I don't recommend it. Instead, the JOIN subquery returns each thread_id paired with its newest timestamp, and that pair is matched back against the table to pick up content and any other columns you need from the row. If each row had a unique id, you could also write a subquery that returns only the id of the row with the MAX(timestamp) for each thread and use it inside an IN(). Absent a unique id, a join against the (thread_id, timestamp) pair does the job.
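
The IN() variant could look roughly like the sketch below. It assumes a unique `id` column, which the table in the question does not have, and it is only one possible reading of the idea: the subquery keeps the highest id among the rows holding each thread's newest timestamp. SQLite again stands in for MySQL, and the sample rows and the `latest_per_thread_by_id` helper are hypothetical.

```python
import sqlite3

# Hypothetical rows with an added unique id: (id, thread_id, content, timestamp).
# Ids 2 and 3 tie on the newest timestamp in thread 1.
ROWS_WITH_ID = [
    (1, 1, "first in thread 1", "2012-01-01 10:00:00"),
    (2, 1, "tied newest, lower id", "2012-01-03 09:30:00"),
    (3, 1, "tied newest, higher id", "2012-01-03 09:30:00"),
    (4, 2, "only message in thread 2", "2012-01-04 08:00:00"),
]

QUERY_BY_ID = """
SELECT thread_id, content, timestamp
FROM messaging_message
WHERE id IN (
    -- one id per thread: the highest id among rows with the newest timestamp
    SELECT MAX(m.id)
    FROM messaging_message m
    JOIN (
        SELECT thread_id, MAX(timestamp) AS maxts
        FROM messaging_message
        GROUP BY thread_id
    ) maxt ON m.thread_id = maxt.thread_id AND m.timestamp = maxt.maxts
    GROUP BY m.thread_id
)
ORDER BY thread_id
"""


def latest_per_thread_by_id(rows):
    """Return exactly one newest row per thread, selected by unique id."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE messaging_message "
            "(id INTEGER PRIMARY KEY, thread_id INTEGER, "
            "content TEXT, timestamp TEXT)"
        )
        conn.executemany(
            "INSERT INTO messaging_message VALUES (?, ?, ?, ?)", rows
        )
        return conn.execute(QUERY_BY_ID).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in latest_per_thread_by_id(ROWS_WITH_ID):
        print(row)
```

Because the subquery yields a single id per thread, a tie on the newest timestamp produces one row here rather than two.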

Upvotes: 4
