Reputation: 12096
I have created a messaging system for users, it allows them to send a message to another user. If it is the first time they have spoken then a new conversation is initiated, if not the old conversation continues.
The users inbox lists all conversations the user has had with all other users, these are then ordered by the conversation which has the latest post in it.
A user can only have one conversation with another user.
When a user clicks one of these conversations they are taken to a page showing the whole conversation they've had with newest posts at the top. So it's kind of like a messaging chat functionality.
I have two tables:
userconversation
Contains an auto increment id which is the conversation id, along with the userId and the friendId.
Whoever initates the first conversation will always be userId and the recipient friendId, this will then never change for that conversation.
+----+--------+----------+
| id | userId | friendId |
+----+--------+----------+
usermessages
Contains the specific messages, along with a read flag, the time and conversationId
+----+---------+--------+------+------+----------------+
| id | message | userId | read | time | conversationId |
+----+---------+--------+------+------+----------------+
How it works
When a user goes to message another user, a query will run to check if both users have a match in the userconversation table, if so that conversationId
is used and the conversation carries on, if not a new row is created for them with a unique conversationId
.
Where it gets complicated
So far all is well, however when it comes to displaying the message inbox of all conversations, sorted on the latest post, it get's tricky to do with one query..
To be able to list the conversations you must first find the latest post of each conversation, but as you can't order by before a group this is impossible to do with one query on two tables, so I have to use the following:
SELECT
c.id,
c.userId,
c.friendId,
m2.message,
m2.read,
UNIX_TIMESTAMP(m2.time),
user1.username,
user2.username
FROM
(SELECT MAX(m1.id) AS MessageID
FROM usermessages m1
GROUP BY m1.conversationId) latest_msg
INNER JOIN usermessages m2 ON latest_msg.MessageID = m2.id
INNER JOIN userconversation c ON m2.conversationId = c.id
INNER JOIN user user1 ON c.userId = user.id
INNER JOIN user user2 ON c.friendId = user.id
WHERE c.userId = :userId OR c.friendId = :userId
ORDER BY m2.id DESC
LIMIT 10
I just don't think this is the best way it can be done but can't think of others ways too approach it?
The database table is InnoDB to speed up the joins and improve data integrity so I can't have two auto increment rows.
Is there another way I could get rid of the userconversation table and create a unique Id to put in the conversationId column? I could then just move the userId and friendId over to usermessages ... but this would create a lot of redundant data?
Upvotes: 28
Views: 2385
Reputation: 176
it is being used on fiverr.com and www.infinitbin.com. I developed the infinitbin own. It has two databases like yours too. the inbox table:-
+----+--------+----------+-------------+------------+--------------------------------+
| id | useridto | useridfrom | conversation | last_content | lastviewed | datecreated|
+----+--------+----------+-------------+------------+--------------------------------+
This table is very important, used to list the conversations/inbox. The last_content field is 140 charaters from the last message between the conversations. lastviewed is an integer field, the user who lasts sends a message is the last viewed, if the other user in the conversation reads the message. it gets updated to NULL. Therefore to get notifications, you for lastviewed that is not null and not the logged in user's id.
The conversation field is 'userid-userid', therefor strings. To check if users have started a conversation, you concatenate there user_ids with a hyphen and check it.
This kind of messaging system is a very complicating one.
The second table is quite simple.
+----+--------+----------+-------------+-------+
| id | inboxid | userid | content | datecreated|
+----+--------+----------+-------------+-------+
Upvotes: 1
Reputation: 164
The simpleste way I think for this is:
Tables:
conversation(cid | userId | friendId | last_message_id)
messages(mid | message | userId | read | time | cid)
Then update last_message_id after each message insert by users in particulate conversation.
And then run this simple query. It will give you what you want.
SELECT * FROM conversation c, messages m
WHERE (c.userId='$uid' OR c.friendId='$uid')
AND c.last_msg_id=m.message_id
ORDER BY created_time DESC
$uid is id of logged in user.
So in actual what this process is doing:
Upvotes: 0
Reputation: 5012
Extending the answer suggested by Watcher.
You should consider dropping the "conversation" concept to simplify further.
+----+---------+------+------------------+--------+----------+
| id | message | read | time | toUser | fromUser |
+----+---------+------+------------------+--------+----------+
| 1 | test 1 | 0 | (some timestamp) | 3 | 4 |
| 2 | test 2 | 0 | (some timestamp) | 4 | 3 |
+----+---------+------+------------------+--------+----------+
List of all conversations for user 123:
SELECT * FROM (
SELECT id, message, toUser, fromUser
FROM userMessages
WHERE toUser = 123 OR fromUser = 123
ORDER BY id DESC
) AS internalTable
GROUP BY toUser, fromUser
List entire conversation between user 123 and user 456:
SELECT *
FROM userMessages
WHERE (toUser = 123 OR fromUser = 123)
AND (toUser = 456 OR fromUser = 456)
ORDER BY time DESC
Upvotes: -1
Reputation: 16468
I think you do not need to create a userconversation table.
If only user can have only one conversation with someone, the unique id for this thread is a concat between userId and friendId. So I move the friendId column in usersmessage table. The problem of order (friendId-userId is the same thread of userId-friendId) can be solved so:
SELECT CONCAT(GREATEST(userId,FriendId),"_",LEAST(userId,FriendId)) AS threadId
Now there is a problem of fetch the last message after a GROUP BY threadId.
I think is a good solution make a concat between DATE and message and after a MAX on this field.
I assume, for simplicity, column date is a DATETIME field ('YYYY-mm-dd H:i:s') but it not need because there is FROM_UNIXTIME function.
So the final query is
SELECT
CONCAT(GREATEST(userId,FriendId),"_",LEAST(userId,FriendId)) AS threadId,
friendId, MAX(date) AS last_date,
MAX(CONCAT(date,"|",message)) AS last_date_and_message
FROM usermessages
WHERE userId = :userId OR friendId = :userId
GROUP BY threadId ORDER BY last_date DESC
the result of field last_date_and_message is something like so:
2012-05-18 00:18:54|Hi my friend this is my last message
it can be simply parsed from your server side code.
Upvotes: 1
Reputation: 173642
I would set it up like this
conversations (#id, last_message_id)
participation (#uid1, #uid2, conversation_id)
messages (#conversation_id, #id, uid, contents, read, *time)
conversations
This table will be used mainly to generate a new identifier for each conversation, together with a calculated field of the last update (for optimization). The two users have been disconnected from this table and moved into participation
.
participation
This table records the conversations between two users in both directions; to explain why, take a look at the following key:
ALTER TABLE `table` ADD PRIMARY(uid1, uid2);
While this is good for both enforcing the uniqueness and simple lookups, you should be aware of the following behavior:
SELECT * FROM table WHERE uid1=1 AND uid2=2
SELECT * FROM table WHERE uid1=1
SELECT * FROM table WHERE uid1=1 AND uid2>5
SELECT * FROM table WHERE uid2=2
The first two queries perform very well, MySQL also optimizes identity lookups on the first part of your key. The third one also yields pretty good performance as the second part of your key can be used for range queries. The last query doesn't perform well at all because the index is "left biased" and therefore it performs a full table scan.
messages
This table stores the actual sent messages, comprising the conversation identifier, sender id, contents, read flag and the time it was sent.
sending messages
To determine whether a conversation between two users has already been established you can simply query the participation
table:
SELECT conversation_id FROM participation WHERE uid1=:sender_id AND uid2=:receiver_id
If it does not yet exist, you create both records:
INSERT INTO conversations (last_message_id) VALUES (NULL);
# fetch last insert id here
INSERT INTO participation VALUES (:sender_id, :receiver_id, :conversation_id), (:receiver_id, :sender_id, :conversation_id);
INSERT INTO messages VALUES (:conversation_id, 0, :sender_id, :message_contents, 0, NOW());
UPDATE conversations SET last_message_id=LAST_INSERT_ID() WHERE id = :conversation_id
If the conversation is already setup: INSERT INTO messages VALUES (:conversation_id, 0, :sender_id, :message_contents, 0, NOW()); UPDATE conversations SET last_message_id=LAST_INSERT_ID() WHERE id = :conversation_id
Note: the UPDATE statement can be scheduled as LOW_PRIORITY because you don't always have to be 100% correct.
conversation overview
This has become a simpler query:
SELECT other_user.name, m.contents, m.read, c.id
FROM participation AS p
INNER JOIN user AS other_user ON other_user.id = p.uid2
INNER JOIN conversation AS c ON c.id = p.conversation_id
INNER JOIN messages AS m ON m.id = c.last_message_id
WHERE p.uid1 = :user_id
ORDER BY m.time DESC
LIMIT 50
Disclaimer: I have not tested this, but the write-up should make sense to you.
Another reason why it's good to have a two-way table is so that it's prepared for sharding, a method in which you push related data into another database (on a different machine); based on certain rules you would determine where to fetch the information from.
You could move the data in these ways:
participation
table up based on the uid1
fieldmessages
table up based on the conversation_id
fieldThe messages overview will get more complicated as you're likely being forced to make two queries; this can be mitigated somewhat with caches (and in extreme case document databases) though.
Hope this gives you some ideas on future planning :)
Upvotes: 1
Reputation: 24661
If you're asking for a way to be able to keep all of your current functionality and work flows, yet keep the data in a single table I think you're pretty close.
Instead of having the conversationId
be a key to a different table, I would instead have it point to the ID of the message that began the conversation. This would create a parent-child relationship between messages that began a conversation and all those that followed after it. To be able to see all conversations, you would just select all messages where the conversationId
is null. Below is a representation of a 2 message conversation:
+----+---------+------+------------------+----------------+--------+----------+
| id | message | read | time | conversationId | toUser | fromUser |
+----+---------+------+------------------+----------------+--------+----------+
| 1 | test 1 | 0 | (some timestamp) | null | 3 | 4 |
| 2 | test 2 | 0 | (some timestamp) | 1 | 4 | 3 |
+----+---------+------+------------------+----------------+--------+----------+
The conversation was initiated by user 3. All messages in the conversation can be filter by conversationId
. One limitation of this design is that only 2 users can be apart of the conversation.
Update
You could get the last message given a conversation id this way:
SELECT id, message
FROM userMessages
WHERE conversationId = {conversationId}
ORDER BY time DESC
LIMIT 1
Upvotes: 4
Reputation: 52137
Since a given pair of users can have at most one conversation, there is no need to "invent" separate key just to identify conversations. Also, the wording of your question seems to suggest that a message is always sent to a single user, so I'd probably go with something like this:
Now, there are several things to note about this model:
I1
is relatively expensive. There are ways to work around that, but the resulting complications are probably not worth it.With this data model, it becomes rather easy to sort the "conversations" (identified by user pairs) by the latest message. For example (replace 1
with desired user's USER_ID):
SELECT *
FROM (
SELECT USER1_ID, USER2_ID, MAX(SEND_TIME) NEWEST
FROM MESSAGE
WHERE (USER1_ID = 1 OR USER2_ID = 1)
GROUP BY USER1_ID, USER2_ID
) Q
ORDER BY NEWEST DESC;
(OR USER2_ID = 1
is the reason for the secondary index I1
.)
If you want not just latest times, but also latest messages, you can do something like this:
SELECT * FROM MESSAGE T1
WHERE
(USER1_ID = 1 OR USER2_ID = 1)
AND SEND_TIME = (
SELECT MAX(SEND_TIME)
FROM MESSAGE T2
WHERE
T1.USER1_ID = T2.USER1_ID
AND T1.USER2_ID = T2.USER2_ID
)
ORDER BY SEND_TIME DESC;
You can play with it in the SQL Fiddle.
1 If that's not the case, you can use monotonically-incrementing INT instead, but you'll have to SELECT MAX(...)
yourself since auto-increment doesn't work on PK subset; or simply make it PK alone and have secondary indexes on both USER1_ID and USER2_ID (fortunately, they would be slimmer since the PK is slimmer).
Upvotes: 3
Reputation: 170
I haven't tested this approach as I don't have access to mysqldb right now. But, I think you should be able to get this done by using a ranking function. Since mysql doesn't have an equivalent of row_number function of Oracle I think you can do it like this:
Select * from (
Select
uc.id,
uc.user_id,
uc.friend_id
um.message
um.read,
um.time,
@rownum := IF(@prev_val = um.conversation_id, @rownum + 1, 1) AS rank,
@prev_val := um.conversation_id
From
userconversation uc,
usermessages um,
(select @row_num:=1) rows,
(select @prev_val:='') partitions
Where
uc.id=um.conversation_id
and c.userId = 222 OR c.friendId = 222
Order By
um.conversation_id,um.id desc
)t where t.rank=1
Upvotes: 1
Reputation: 61
Why are you breaking up the data into conversations?
If it were me, I would use one table called 'usermessages' with the following format:
+----+--------+----------+-------------+------------+--------+
| id | userto | userfrom | timecreated | timeviewed | message|
+----+--------+----------+-------------+------------+--------+
A conversation is identified by the combination of the 'userto' and 'userfrom' columns. So, when you want to select all of a conversation:
SELECT * FROM usermessages
WHERE (userto = :userto OR userto = :userfrom)
AND (userfrom = :userfrom OR userfrom = :userto)
ORDER BY timecreated DESC
LIMIT 10
Upvotes: 1
Reputation: 20102
hmm maybe i'm not understanding correctly your problem... but to me the solution is quite simple:
SELECT c.*, MAX(m.time) as latest_post
FROM conversations as c
INNER JOIN messages as m ON c.id = m.conversation_id
WHERE c.userId = 222 OR c.friendId = 222
GROUP BY c.id
ORDER BY latest_post DESC
here's my test data:
Conversations :
id userId friendId
1 222 333
2 222 444
Messages :
id message time (Desc) conversation_id
14 rty 2012-05-14 19:59:55 2
13 cvb 2012-05-14 19:59:51 1
12 dfg 2012-05-14 19:59:46 2
11 ert 2012-05-14 19:59:42 1
1 foo 2012-05-14 19:22:57 2
2 bar 2012-05-14 19:22:57 2
3 foo 2012-05-14 19:14:13 1
8 wer 2012-05-13 19:59:37 2
9 sdf 2012-05-13 19:59:24 1
10 xcv 2012-05-11 19:59:32 2
4 bar 2012-05-10 19:58:06 1
6 zxc 2012-05-08 19:59:17 2
5 asd 2012-05-08 19:58:56 1
7 qwe 2012-05-04 19:59:20 1
Query result :
id userId friendId latest_post
2 222 444 2012-05-14 19:59:55
1 222 333 2012-05-14 19:59:51
If that's not it... just ignore my answer :P
Hope this helps
Upvotes: 6
Reputation: 425713
If you can only have one conversation between users, I don't see a reason for a dedicated conversations table. For this query to work fast, you would need a composite index on (user, message_id)
which is impossible if these fields are in different tables. Move user_id
and friend_id
to the userconversations
. This will make your table 8
bytes per record heavier (even assuming 8
-byte identifiers) which is hardly a problem for a table containing text messages.
If you have few conversations per user with many messages in each, use this:
SELECT um.*
FROM (
(
SELECT MAX(id) AS messageId
FROM usermessages m1
WHERE user_id = :me
GROUP BY
friend_id
ORDER BY
messageId DESC
LIMIT 10
)
UNION ALL
(
SELECT MAX(id) AS messageId
FROM usermessages m1
WHERE frient_id = :me
GROUP BY
user_id
ORDER BY
messageId DESC
LIMIT 10
)
) q
JOIN usermessages um
ON um.id = q.messageId
ORDER BY
id DESC
LIMIT 10
Create separate indexes on user_id
and friend_id
If you have many conversations with few messages in each, use this query:
(
SELECT *
FROM usermessages um
WHERE user_id = :me
AND id =
(
SELECT MAX(id)
FROM usermessages umi
WHERE umi.user_id = um.user_id
AND umi.friend_id = um.friend_id
)
ORDER BY
id DESC
LIMIT 10
)
UNION ALL
(
SELECT *
FROM usermessages um
WHERE frient_id = :me
AND id =
(
SELECT MAX(id)
FROM usermessages umi
WHERE umi.user_id = um.user_id
AND umi.friend_id = um.friend_id
)
ORDER BY
id DESC
LIMIT 10
)
ORDER BY
id DESC
LIMIT 10
The idea behind this query is that it just descends all messages for the given user, checking that each message is the last in its conversation. This may be much faster than sorting all last messages for all conversations (if you have many of them).
For this to work fast, create indexes on
friend_id
user_id, friend_id
Upvotes: 3
Reputation: 1905
How to create a fast Facebook-like messages system. tested and widely used by Arutz Sheva users - http://www.inn.co.il (Hebrew).
create a "topic" (conversation) table:
CREATE TABLEpb_topics
(t_id
int(11) NOT NULL AUTO_INCREMENT,t_last
int(11) NOT NULL DEFAULT '0',t_user
int(11) NOT NULL DEFAULT '0', PRIMARY KEY (t_id
), KEYlast
(t_last
) ) ENGINE=InnoDB AUTO_INCREMENT=137106342 DEFAULT CHARSET=utf8
create link between user and conversation:
CREATE TABLEpb_links
(l_id
int(11) NOT NULL AUTO_INCREMENT,l_user
int(11) NOT NULL DEFAULT '0',l_new
int(11) NOT NULL DEFAULT '0',l_topic
int(11) NOT NULL DEFAULT '0',l_visible
int(11) NOT NULL DEFAULT '1',l_bcc
int(11) NOT NULL DEFAULT '0', PRIMARY KEY (l_id
) USING BTREE, UNIQUE KEYtopic-user
(l_topic
,l_user
), KEYuser-topicnew
(l_user
,l_new
,l_topic
) USING BTREE, KEYuser-topic
(l_user
,l_visible
,l_topic
) USING BTREE ) ENGINE=InnoDB AUTO_INCREMENT=64750078 DEFAULT CHARSET=utf8
create a message
CREATE TABLEpb_messages
(m_id
int(11) NOT NULL AUTO_INCREMENT,m_from
int(11) NOT NULL,m_date
datetime NOT NULL DEFAULT '1987-11-13 00:00:00',m_title
varchar(75) NOT NULL,m_content
mediumtext NOT NULL,m_topic
int(11) NOT NULL, PRIMARY KEY (m_id
), KEYdate_topic
(m_date
,m_topic
), KEYtopic_date_from
(m_topic
,m_date
,m_from
) ) ENGINE=InnoDB
A conversation can be with 2 or more friends (BCC was added like email, but you can skip it).
Insert a new message: 1. Create new topic 2. Create Links for users (from/to) 3. Add Message (4. Update users cache table - user have messages)
Add Message to topic: Add Message
Select folder:
select
z.*, group_concat(u_name) as users_name from
(select max(m_id) as m_id, m_topic as t_id, m_From, m_title,m_date, l_new
from pb_links as l1, pb_messages
where l1.l_user=<user> and m_from < If(inbox, "<>", "=") > and m_topic=l_topic and l1.l_visible=1
group by m_topic order by m_id desc limit " & iPage * 35 & ",35) z
left join pb_links l2 on (l2.l_topic=t_id)
left join users on (l_user=u_id and l_bcc=0 and l_user<user>)
group by l_topic order by m_date desc;
In details:
The first is the inner select - this is the fastest way (I was check about 7 other options, checked also in Percona/MariaDB versions) to get all messages, and get also the last message to display in the list. Also look in the inner IF - in inbox, the last message is anyone but not me, and In outbox - the opposite. LIMIT used for paging.
The outer one used to add user list (just name comma name string) and more information for only one message per topic, and after paging (I need to add user list just to the 35-per-page messages, and not for all my large history).
Also, I wrote in hebrew here: http://blogs.microsoft.co.il/blogs/moshel/archive/2010/08/12/quot-x-quot.aspx to create a simple cache table, and forbid the workload of select count from busy message table.
Upvotes: 1