Reputation: 477
I have a messaging system.
threads
+----+-------+
| id | title |
+----+-------+
| PK | TEXT |
+----+-------+
messages
+----+--------------+----------------+-----------+-------------+---------+
| id | from_id | thread_id | sent | parent | message |
+----+--------------+----------------+-----------+-------------+---------+
| PK | FK(users.id) | FK(threads.id) | TIMESTAMP | messages.id | TEXT |
+----+--------------+----------------+-----------+-------------+---------+
recipients
+----+-----------------+--------------+--------+
| id | msg_id | to_id | status |
+----+-----------------+--------------+--------+
| PK | FK(messages.id) | FK(users.id) | ENUM |
+----+-----------------+--------------+--------+
users
+----+---------+
| id | name |
+----+---------+
| PK | VARCHAR |
+----+---------+
Essentially, it's a messaging system in which:
So hopefully my schema is correct.
I wish to get a list of all threads, which shows the most recent message inside the thread, and the author of that message:
+----------+------------+-----------+--------------+------------------+---------------+-------------------+
| users.id | users.name | thread.id | thread.title | messages.message | messages.sent | recipients.status |
+----------+------------+-----------+--------------+------------------+---------------+-------------------+
The issue is getting the most recent message as part of the query. Given that recipients.status = 1 means unread.. Ignoring users for now (that'd be a relatively simple join onto the rest of the tables...), also assuming we want user 1's threads:
SELECT threads.id, title, message, sent, recipients.status
FROM recipients
JOIN messages
ON messages.id=recipients.msg_id
JOIN threads ON threads.id=messages.thread_id
WHERE recipients.to_id=1
AND recipients.status=1
This gets me all messages in all threads that the user is participating in. However, I only need the most recent, and that is where I am stuck.
One solution which I am very not fond of (are there any reasons NOT to do this?)
SELECT *
FROM (
SELECT threads.id, title, message, sent, recipients.status
FROM recipients
JOIN messages
ON messages.id=recipients.msg_id
JOIN threads ON threads.id=messages.thread_id
WHERE recipients.to_id=1
AND recipients.status=1
ORDER BY sent DESC
) a
GROUP BY id
Upvotes: 1
Views: 93
Reputation: 32402
One solution which I am very not fond of (are there any reasons NOT to do this?)
Your query will not necessarily select the row with the latest sent
value for each thread. Even though your inner query orders by sent DESC
, mysql is free to choose any value from each group:
https://dev.mysql.com/doc/refman/5.0/en/group-by-extensions.html
MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause. Sorting of the result set occurs after values have been chosen, and ORDER BY does not affect which values within each group the server chooses.
I recommend using variables to emulate row_number()
to number messages within a thread in order of when they were sent (i.e. most recent sent message within a thread will be #1, 2nd most recent #2, etc) and then to only keep #1 messages.
SELECT * FROM (
SELECT threads.id, title, message, sent, recipients.status,
@rowNumber := IF(@prevId = threads.id,@rowNumber+1,1) rowNumber,
@prevId := threads.id
FROM recipients
JOIN messages
ON messages.id=recipients.msg_id
JOIN threads ON threads.id=messages.thread_id
WHERE recipients.to_id=1
AND recipients.status=1
ORDER BY threads.id, sent DESC
) t1 WHERE rowNumber = 1
Edit
Another way using not exists
to only select messages where a more recent message in the same thread does not exist.
SELECT threads.id, title, message, sent, recipients.status
FROM recipients
JOIN messages
ON messages.id=recipients.msg_id
JOIN threads ON threads.id=messages.thread_id
WHERE recipients.to_id=1
AND recipients.status=1
AND NOT EXISTS (
SELECT 1 FROM threads t2
WHERE t2.id = threads.id
AND t2.sent > threads.sent
)
Upvotes: 1
Reputation: 477
I was able to accomplish it with this:
SELECT threads.id AS thread_id, threads.title, users.id AS user_id, users.name, m1.message, m1.sent
FROM messages m1
LEFT JOIN messages m2
ON m1.thread_id = m2.thread_id AND m1.sent < m2.sent
JOIN recipients
ON recipients.status=1
AND recipients.msg_id=m1.id
AND recipients.to_id=1
JOIN threads
ON threads.id=m1.thread_id
JOIN users
ON m1.from_id=users.id
WHERE m2.sent IS NULL
The relevant parts to the problem would be:
SELECT ...
FROM messages m1
LEFT JOIN messages m2
ON m1.thread_id = m2.thread_id AND m1.sent < m2.sent
WHERE m2.sent IS NULL
Upvotes: 0
Reputation: 108841
The most recent message id from each thread can be gotten thusly.
SELECT MAX(id) AS most_recent_message_id,
thread_id
FROM messages
GROUP BY thread_id
Do you want this to be efficient? In that case create a compound index on (thread_id, id)
.
If you want a list of threads in which a given user (let's say user 42) is participating either as originator or recipient, you need to use a UNION operator
SELECT DISTINCT thread_id FROM messages WHERE user_id = 42 UNION SELECT DISTINCT thread_id FROM recipients JOIN messages ON recipients.msg_id = messages.id WHERE recipients.to_id = 42
This gets you the threads in which a user is participating.
So, if you want the ids of the most recent messages in which your user is participating (either as originator or recipient) you join those two subqueries
SELECT most_recent_message_id
FROM (
SELECT MAX(id) AS most_recent_message_id,
thread_id
FROM messages
GROUP BY thread_id
) AS a
JOIN (
SELECT DISTINCT thread_id
FROM messages
WHERE user_id = 42
UNION
SELECT DISTINCT thread_id
FROM recipients
JOIN messages ON recipients.msg_id = messages.id
WHERE recipients.to_id = 42
) AS b ON a.thread_id = b.thread_id
See how this goes? You use the aggregation MAX()
and set-construction (DISTINCT
, UNION
) features of SQL to construct the lists of relevant items, then you JOIN to get the list you want.
I assume that once you have an appropriate list of messages, you can go get the content you need with another join or two.
Upvotes: 0
Reputation: 226
Same answer than SoftwareCarpente, just add an Order By desc on the message id (or the timestamp) and add a LIMIT 1 if you only need the 1st row.
Upvotes: 0