Dan
Dan

Reputation: 12096

Is there a simpler way to achieve this style of user messaging?

I have created a messaging system for users, it allows them to send a message to another user. If it is the first time they have spoken then a new conversation is initiated, if not the old conversation continues.

The users inbox lists all conversations the user has had with all other users, these are then ordered by the conversation which has the latest post in it.

A user can only have one conversation with another user.

When a user clicks one of these conversations they are taken to a page showing the whole conversation they've had with newest posts at the top. So it's kind of like a messaging chat functionality.

I have two tables:

userconversation

Contains an auto increment id which is the conversation id, along with the userId and the friendId.

Whoever initates the first conversation will always be userId and the recipient friendId, this will then never change for that conversation.

+----+--------+----------+
| id | userId | friendId |
+----+--------+----------+

usermessages

Contains the specific messages, along with a read flag, the time and conversationId

+----+---------+--------+------+------+----------------+
| id | message | userId | read | time | conversationId |
+----+---------+--------+------+------+----------------+

How it works

When a user goes to message another user, a query will run to check if both users have a match in the userconversation table, if so that conversationId is used and the conversation carries on, if not a new row is created for them with a unique conversationId.

Where it gets complicated

So far all is well, however when it comes to displaying the message inbox of all conversations, sorted on the latest post, it get's tricky to do with one query..

To be able to list the conversations you must first find the latest post of each conversation, but as you can't order by before a group this is impossible to do with one query on two tables, so I have to use the following:

SELECT  
    c.id,
    c.userId,
    c.friendId,
    m2.message,
    m2.read,
    UNIX_TIMESTAMP(m2.time),      
    user1.username,
    user2.username  
FROM 
    (SELECT MAX(m1.id) AS MessageID 
     FROM usermessages m1 
     GROUP BY m1.conversationId) latest_msg

INNER JOIN usermessages m2 ON latest_msg.MessageID = m2.id 
INNER JOIN userconversation c ON m2.conversationId = c.id
INNER JOIN user user1 ON c.userId = user.id
INNER JOIN user user2 ON c.friendId = user.id

WHERE c.userId = :userId OR c.friendId = :userId
ORDER BY m2.id DESC
LIMIT 10

I just don't think this is the best way it can be done but can't think of others ways too approach it?

The database table is InnoDB to speed up the joins and improve data integrity so I can't have two auto increment rows.

Is there another way I could get rid of the userconversation table and create a unique Id to put in the conversationId column? I could then just move the userId and friendId over to usermessages ... but this would create a lot of redundant data?

Upvotes: 28

Views: 2385

Answers (12)

Spikes
Spikes

Reputation: 176

it is being used on fiverr.com and www.infinitbin.com. I developed the infinitbin own. It has two databases like yours too. the inbox table:-

+----+--------+----------+-------------+------------+--------------------------------+
| id | useridto | useridfrom | conversation | last_content | lastviewed | datecreated|
+----+--------+----------+-------------+------------+--------------------------------+

This table is very important, used to list the conversations/inbox. The last_content field is 140 charaters from the last message between the conversations. lastviewed is an integer field, the user who lasts sends a message is the last viewed, if the other user in the conversation reads the message. it gets updated to NULL. Therefore to get notifications, you for lastviewed that is not null and not the logged in user's id.

The conversation field is 'userid-userid', therefor strings. To check if users have started a conversation, you concatenate there user_ids with a hyphen and check it.

This kind of messaging system is a very complicating one.

The second table is quite simple.

+----+--------+----------+-------------+-------+
| id | inboxid | userid | content | datecreated|
+----+--------+----------+-------------+-------+

Upvotes: 1

Jadzia
Jadzia

Reputation: 164

The simpleste way I think for this is:

Tables:

conversation(cid | userId | friendId | last_message_id)
messages(mid | message | userId | read | time | cid)

Then update last_message_id after each message insert by users in particulate conversation.

And then run this simple query. It will give you what you want.

SELECT * FROM conversation c, messages m 
WHERE (c.userId='$uid' OR c.friendId='$uid')
AND c.last_msg_id=m.message_id
ORDER BY created_time DESC

$uid is id of logged in user.

So in actual what this process is doing:

  1. Displaying all conversation of logged in user.
  2. Referencing last message (so you do not need group by)
  3. And last displaying messages in order by desc.

Upvotes: 0

Matt Beckman
Matt Beckman

Reputation: 5012

Extending the answer suggested by Watcher.

You should consider dropping the "conversation" concept to simplify further.

+----+---------+------+------------------+--------+----------+
| id | message | read | time             | toUser | fromUser |
+----+---------+------+------------------+--------+----------+
| 1  |  test 1 |  0   | (some timestamp) |  3     |   4      |
| 2  |  test 2 |  0   | (some timestamp) |  4     |   3      |
+----+---------+------+------------------+--------+----------+

List of all conversations for user 123:

SELECT * FROM (
    SELECT id, message, toUser, fromUser   
    FROM userMessages 
    WHERE toUser = 123 OR fromUser = 123 
    ORDER BY id DESC
) AS internalTable 
GROUP BY toUser, fromUser 

List entire conversation between user 123 and user 456:

SELECT * 
FROM userMessages
WHERE (toUser = 123 OR fromUser = 123) 
AND (toUser = 456 OR fromUser = 456)
ORDER BY time DESC

Upvotes: -1

Luca Rainone
Luca Rainone

Reputation: 16468

I think you do not need to create a userconversation table.

If only user can have only one conversation with someone, the unique id for this thread is a concat between userId and friendId. So I move the friendId column in usersmessage table. The problem of order (friendId-userId is the same thread of userId-friendId) can be solved so:

SELECT CONCAT(GREATEST(userId,FriendId),"_",LEAST(userId,FriendId)) AS threadId

Now there is a problem of fetch the last message after a GROUP BY threadId.

I think is a good solution make a concat between DATE and message and after a MAX on this field.

I assume, for simplicity, column date is a DATETIME field ('YYYY-mm-dd H:i:s') but it not need because there is FROM_UNIXTIME function.

So the final query is

SELECT 
        CONCAT(GREATEST(userId,FriendId),"_",LEAST(userId,FriendId)) AS threadId,
        friendId, MAX(date) AS last_date, 
        MAX(CONCAT(date,"|",message)) AS last_date_and_message 

FROM usermessages
WHERE userId = :userId OR friendId = :userId
GROUP BY threadId ORDER BY last_date DESC

the result of field last_date_and_message is something like so:

2012-05-18 00:18:54|Hi my friend this is my last message

it can be simply parsed from your server side code.

Upvotes: 1

Ja͢ck
Ja͢ck

Reputation: 173642

I would set it up like this

Table details

conversations (#id, last_message_id)

participation (#uid1, #uid2, conversation_id)

messages (#conversation_id, #id, uid, contents, read, *time)

conversations

This table will be used mainly to generate a new identifier for each conversation, together with a calculated field of the last update (for optimization). The two users have been disconnected from this table and moved into participation.

participation

This table records the conversations between two users in both directions; to explain why, take a look at the following key:

ALTER TABLE `table` ADD PRIMARY(uid1, uid2);

While this is good for both enforcing the uniqueness and simple lookups, you should be aware of the following behavior:

  • SELECT * FROM table WHERE uid1=1 AND uid2=2
  • SELECT * FROM table WHERE uid1=1
  • SELECT * FROM table WHERE uid1=1 AND uid2>5
  • SELECT * FROM table WHERE uid2=2

The first two queries perform very well, MySQL also optimizes identity lookups on the first part of your key. The third one also yields pretty good performance as the second part of your key can be used for range queries. The last query doesn't perform well at all because the index is "left biased" and therefore it performs a full table scan.

messages

This table stores the actual sent messages, comprising the conversation identifier, sender id, contents, read flag and the time it was sent.

Operation

sending messages

To determine whether a conversation between two users has already been established you can simply query the participation table:

SELECT conversation_id FROM participation WHERE uid1=:sender_id AND uid2=:receiver_id

If it does not yet exist, you create both records:

INSERT INTO conversations (last_message_id) VALUES (NULL);
# fetch last insert id here
INSERT INTO participation VALUES (:sender_id, :receiver_id, :conversation_id), (:receiver_id, :sender_id, :conversation_id);
INSERT INTO messages VALUES (:conversation_id, 0, :sender_id, :message_contents, 0, NOW());
UPDATE conversations SET last_message_id=LAST_INSERT_ID() WHERE id = :conversation_id

If the conversation is already setup: INSERT INTO messages VALUES (:conversation_id, 0, :sender_id, :message_contents, 0, NOW()); UPDATE conversations SET last_message_id=LAST_INSERT_ID() WHERE id = :conversation_id

Note: the UPDATE statement can be scheduled as LOW_PRIORITY because you don't always have to be 100% correct.

conversation overview

This has become a simpler query:

SELECT other_user.name, m.contents, m.read, c.id
FROM participation AS p
INNER JOIN user AS other_user ON other_user.id = p.uid2
INNER JOIN conversation AS c ON c.id = p.conversation_id
INNER JOIN messages AS m ON m.id = c.last_message_id
WHERE p.uid1 = :user_id
ORDER BY m.time DESC
LIMIT 50

Disclaimer: I have not tested this, but the write-up should make sense to you.

Optimization

Another reason why it's good to have a two-way table is so that it's prepared for sharding, a method in which you push related data into another database (on a different machine); based on certain rules you would determine where to fetch the information from.

You could move the data in these ways:

  1. divide the participation table up based on the uid1 field
  2. divide the messages table up based on the conversation_id field

The messages overview will get more complicated as you're likely being forced to make two queries; this can be mitigated somewhat with caches (and in extreme case document databases) though.

Hope this gives you some ideas on future planning :)

Upvotes: 1

Jeff Lambert
Jeff Lambert

Reputation: 24661

If you're asking for a way to be able to keep all of your current functionality and work flows, yet keep the data in a single table I think you're pretty close.

Instead of having the conversationId be a key to a different table, I would instead have it point to the ID of the message that began the conversation. This would create a parent-child relationship between messages that began a conversation and all those that followed after it. To be able to see all conversations, you would just select all messages where the conversationId is null. Below is a representation of a 2 message conversation:

+----+---------+------+------------------+----------------+--------+----------+
| id | message | read | time             | conversationId | toUser | fromUser |
+----+---------+------+------------------+----------------+--------+----------+
| 1  |  test 1 |  0   | (some timestamp) |  null          |  3     |   4      |
| 2  |  test 2 |  0   | (some timestamp) |   1            |  4     |   3      |
+----+---------+------+------------------+----------------+--------+----------+

The conversation was initiated by user 3. All messages in the conversation can be filter by conversationId. One limitation of this design is that only 2 users can be apart of the conversation.

Update

You could get the last message given a conversation id this way:

SELECT id, message 
FROM userMessages 
WHERE conversationId = {conversationId} 
ORDER BY time DESC 
LIMIT 1

Upvotes: 4

Branko Dimitrijevic
Branko Dimitrijevic

Reputation: 52137

Since a given pair of users can have at most one conversation, there is no need to "invent" separate key just to identify conversations. Also, the wording of your question seems to suggest that a message is always sent to a single user, so I'd probably go with something like this:

enter image description here

Now, there are several things to note about this model:

  • It assumes messages between same two users cannot be generated more frequently than the resolution provided by the type used for SEND_TIME.1
  • The direction of the message is not determined by order of USER1_ID and USER2_ID, but with a separate flag (DIRECTION). This way, a message between given users will always have the same combination of USER1_ID and USER2_ID (enforced by the CHECK above), regardless of who sent and who received the message. This greatly simplifies querying.
  • It is unfortunate that all InnoDB tables are clustered, so the secondary index I1 is relatively expensive. There are ways to work around that, but the resulting complications are probably not worth it.

With this data model, it becomes rather easy to sort the "conversations" (identified by user pairs) by the latest message. For example (replace 1 with desired user's USER_ID):

SELECT *
FROM (
    SELECT USER1_ID, USER2_ID, MAX(SEND_TIME) NEWEST
    FROM MESSAGE
    WHERE (USER1_ID = 1 OR USER2_ID = 1)
    GROUP BY USER1_ID, USER2_ID
) Q
ORDER BY NEWEST DESC;

(OR USER2_ID = 1 is the reason for the secondary index I1.)

If you want not just latest times, but also latest messages, you can do something like this:

SELECT * FROM MESSAGE T1
WHERE
    (USER1_ID = 1 OR USER2_ID = 1)
    AND SEND_TIME = (
        SELECT MAX(SEND_TIME)
        FROM MESSAGE T2
        WHERE
            T1.USER1_ID = T2.USER1_ID
            AND T1.USER2_ID = T2.USER2_ID
    )
ORDER BY SEND_TIME DESC;

You can play with it in the SQL Fiddle.


1 If that's not the case, you can use monotonically-incrementing INT instead, but you'll have to SELECT MAX(...) yourself since auto-increment doesn't work on PK subset; or simply make it PK alone and have secondary indexes on both USER1_ID and USER2_ID (fortunately, they would be slimmer since the PK is slimmer).

Upvotes: 3

Rakesh Singh
Rakesh Singh

Reputation: 170

I haven't tested this approach as I don't have access to mysqldb right now. But, I think you should be able to get this done by using a ranking function. Since mysql doesn't have an equivalent of row_number function of Oracle I think you can do it like this:

Select * from (
Select 
    uc.id, 
    uc.user_id, 
    uc.friend_id 
    um.message
    um.read, 
    um.time,
    @rownum := IF(@prev_val = um.conversation_id, @rownum + 1, 1) AS rank,
    @prev_val := um.conversation_id
From
    userconversation uc,
    usermessages um,
    (select @row_num:=1) rows,
    (select @prev_val:='') partitions
Where 
    uc.id=um.conversation_id        
    and c.userId = 222 OR c.friendId = 222 

Order By 
    um.conversation_id,um.id desc
)t where t.rank=1

Upvotes: 1

Blake
Blake

Reputation: 61

Why are you breaking up the data into conversations?

If it were me, I would use one table called 'usermessages' with the following format:

+----+--------+----------+-------------+------------+--------+
| id | userto | userfrom | timecreated | timeviewed | message|
+----+--------+----------+-------------+------------+--------+

A conversation is identified by the combination of the 'userto' and 'userfrom' columns. So, when you want to select all of a conversation:

SELECT * FROM usermessages 
WHERE (userto = :userto OR userto = :userfrom) 
AND (userfrom = :userfrom OR userfrom = :userto) 
ORDER BY timecreated DESC 
LIMIT 10

Upvotes: 1

pleasedontbelong
pleasedontbelong

Reputation: 20102

hmm maybe i'm not understanding correctly your problem... but to me the solution is quite simple:

SELECT c.*, MAX(m.time) as latest_post 
FROM conversations as c 
INNER JOIN messages as m ON c.id = m.conversation_id
WHERE c.userId = 222 OR c.friendId = 222 
GROUP BY c.id
ORDER BY latest_post DESC

here's my test data:

Conversations :

id  userId  friendId
1   222     333
2   222     444

Messages :

id  message     time (Desc)     conversation_id
14  rty     2012-05-14 19:59:55     2
13  cvb     2012-05-14 19:59:51     1
12  dfg     2012-05-14 19:59:46     2
11  ert     2012-05-14 19:59:42     1
1   foo     2012-05-14 19:22:57     2
2   bar     2012-05-14 19:22:57     2
3   foo     2012-05-14 19:14:13     1
8   wer     2012-05-13 19:59:37     2
9   sdf     2012-05-13 19:59:24     1
10  xcv     2012-05-11 19:59:32     2
4   bar     2012-05-10 19:58:06     1
6   zxc     2012-05-08 19:59:17     2
5   asd     2012-05-08 19:58:56     1
7   qwe     2012-05-04 19:59:20     1

Query result :

id  userId  friendId    latest_post
2   222     444     2012-05-14 19:59:55
1   222     333     2012-05-14 19:59:51

If that's not it... just ignore my answer :P

Hope this helps

Upvotes: 6

Quassnoi
Quassnoi

Reputation: 425713

If you can only have one conversation between users, I don't see a reason for a dedicated conversations table. For this query to work fast, you would need a composite index on (user, message_id) which is impossible if these fields are in different tables. Move user_id and friend_id to the userconversations. This will make your table 8 bytes per record heavier (even assuming 8-byte identifiers) which is hardly a problem for a table containing text messages.

If you have few conversations per user with many messages in each, use this:

SELECT  um.*
FROM    (
        (
        SELECT  MAX(id) AS messageId
        FROM    usermessages m1
        WHERE   user_id = :me
        GROUP BY
                friend_id
        ORDER BY
                messageId DESC
        LIMIT 10
        )
        UNION ALL
        (
        SELECT  MAX(id) AS messageId
        FROM    usermessages m1
        WHERE   frient_id = :me
        GROUP BY
                user_id
        ORDER BY
                messageId DESC
        LIMIT 10
        )
        ) q
JOIN    usermessages um
ON      um.id = q.messageId
ORDER BY
        id DESC
LIMIT 10

Create separate indexes on user_id and friend_id

If you have many conversations with few messages in each, use this query:

(
SELECT  *
FROM    usermessages um
WHERE   user_id = :me
        AND id = 
        (
        SELECT  MAX(id)
        FROM    usermessages umi
        WHERE   umi.user_id = um.user_id
                AND umi.friend_id = um.friend_id
        )
ORDER BY
        id DESC
LIMIT 10
)
UNION ALL
(
SELECT  *
FROM    usermessages um
WHERE   frient_id = :me
        AND id = 
        (
        SELECT  MAX(id)
        FROM    usermessages umi
        WHERE   umi.user_id = um.user_id
                AND umi.friend_id = um.friend_id
        )
ORDER BY
        id DESC
LIMIT 10
)
ORDER BY
        id DESC
LIMIT 10

The idea behind this query is that it just descends all messages for the given user, checking that each message is the last in its conversation. This may be much faster than sorting all last messages for all conversations (if you have many of them).

For this to work fast, create indexes on

friend_id
user_id, friend_id

Upvotes: 3

Moshe L
Moshe L

Reputation: 1905

How to create a fast Facebook-like messages system. tested and widely used by Arutz Sheva users - http://www.inn.co.il (Hebrew).

  1. create a "topic" (conversation) table:

      CREATE TABLE pb_topics (
      t_id int(11) NOT NULL AUTO_INCREMENT,
      t_last int(11) NOT NULL DEFAULT '0',
      t_user int(11) NOT NULL DEFAULT '0',
      PRIMARY KEY (t_id),
      KEY last (t_last)
    ) ENGINE=InnoDB AUTO_INCREMENT=137106342 DEFAULT CHARSET=utf8

  2. create link between user and conversation:

        CREATE TABLE pb_links (
      l_id int(11) NOT NULL AUTO_INCREMENT,
      l_user int(11) NOT NULL DEFAULT '0',
      l_new int(11) NOT NULL DEFAULT '0',
      l_topic int(11) NOT NULL DEFAULT '0',
      l_visible int(11) NOT NULL DEFAULT '1',
      l_bcc int(11) NOT NULL DEFAULT '0',
      PRIMARY KEY (l_id) USING BTREE,
      UNIQUE KEY topic-user (l_topic,l_user),
      KEY user-topicnew (l_user,l_new,l_topic) USING BTREE,
      KEY user-topic (l_user,l_visible,l_topic) USING BTREE
    ) ENGINE=InnoDB AUTO_INCREMENT=64750078 DEFAULT CHARSET=utf8

  3. create a message

        CREATE TABLE pb_messages (
      m_id int(11) NOT NULL AUTO_INCREMENT,
      m_from int(11) NOT NULL,
      m_date datetime NOT NULL DEFAULT '1987-11-13 00:00:00',
      m_title varchar(75) NOT NULL,
      m_content mediumtext NOT NULL,
      m_topic int(11) NOT NULL,
      PRIMARY KEY (m_id),
      KEY date_topic (m_date,m_topic),
      KEY topic_date_from (m_topic,m_date,m_from)
    ) ENGINE=InnoDB 

A conversation can be with 2 or more friends (BCC was added like email, but you can skip it).

Insert a new message: 1. Create new topic 2. Create Links for users (from/to) 3. Add Message (4. Update users cache table - user have messages)

Add Message to topic: Add Message

Select folder:

select 
     z.*, group_concat(u_name) as users_name from
         (select max(m_id) as m_id, m_topic as t_id,  m_From, m_title,m_date, l_new 
              from pb_links as l1, pb_messages 
              where l1.l_user=<user>  and m_from < If(inbox, "<>", "=") > and m_topic=l_topic and l1.l_visible=1 
               group by m_topic order by m_id desc limit " & iPage * 35 & ",35) z
           left join  pb_links l2  on (l2.l_topic=t_id)
           left join  users  on (l_user=u_id and l_bcc=0 and l_user<user>)  
            group by l_topic order by m_date desc;

In details:

The first is the inner select - this is the fastest way (I was check about 7 other options, checked also in Percona/MariaDB versions) to get all messages, and get also the last message to display in the list. Also look in the inner IF - in inbox, the last message is anyone but not me, and In outbox - the opposite. LIMIT used for paging.

The outer one used to add user list (just name comma name string) and more information for only one message per topic, and after paging (I need to add user list just to the 35-per-page messages, and not for all my large history).

Also, I wrote in hebrew here: http://blogs.microsoft.co.il/blogs/moshel/archive/2010/08/12/quot-x-quot.aspx to create a simple cache table, and forbid the workload of select count from busy message table.

Upvotes: 1

Related Questions