Vishnu M Raveendran

Reputation: 411

Chat document structure MongoDB

I'm building a chat app and using MongoDB for storage. I have come up with the following document structure:

{
    _id:
    sender_id:
    receiver_id:
    subject:
    created_at:
    updated_at:
    messages: [
        {
            _id:
            message:
            author_id:
            attatchments: [x,y,z],
            read:
            created_at:
        },
        {
            _id:
            message:
            author_id:
            attatchments: [x,y,z],
            read:
            created_at:
        }
    ]
}

I'm not sure whether this is a good approach in terms of performance and document size. Is there a better way to do it, or is this fine?

Thanks in advance

Upvotes: 1

Views: 6317

Answers (3)

varaprasadh

Reputation: 505

For a chat, it's better to normalize the data as you would in a relational schema. Otherwise, nested data becomes painful to manage when you need to update it or do anything complex.

This is a better way to implement the chat, and it also covers the case where a user wants to delete a message or conversation for themselves only: track a deleted_by property, and once deleted_by contains every participant of the conversation, permanently remove the conversation or message.

conversation schema

{
 id:String,
 participants:[String], //user ids
 created_at:Date,
 deleted_by:[String]
 ...
}

message schema

{
 id:String,
 conversation_id:String,
 sender:String,
 content:String,
 read_by:[String], //user ids
 deleted_by:[String]
 ...
}
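
To make the deleted_by idea concrete, here is a minimal sketch in plain JavaScript, working on in-memory objects rather than a real collection. The helper names markDeletedBy and canPurge are hypothetical and not part of any library; the field names follow the schemas above. In MongoDB itself, the first step would typically be an update using $addToSet on deleted_by.

```javascript
// Hypothetical helpers; field names follow the conversation/message schemas above.

// Record that `userId` deleted the item (conversation or message) for themselves.
// Returns a new object and leaves the input untouched.
function markDeletedBy(item, userId) {
  const deletedBy = new Set(item.deleted_by || []);
  deletedBy.add(userId); // a Set keeps the list free of duplicates
  return { ...item, deleted_by: [...deletedBy] };
}

// True once every participant has deleted the item,
// i.e. it is safe to remove it permanently.
function canPurge(item, participants) {
  const deletedBy = new Set(item.deleted_by || []);
  return participants.length > 0 && participants.every((id) => deletedBy.has(id));
}

const conversation = { id: "c1", participants: ["alice", "bob"], deleted_by: [] };
const afterAlice = markDeletedBy(conversation, "alice");
const afterBoth = markDeletedBy(afterAlice, "bob");

console.log(canPurge(afterAlice, conversation.participants)); // false
console.log(canPurge(afterBoth, conversation.participants));  // true
```

For messages, the same check would be run against the participants of the parent conversation.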

Upvotes: 1

Amit Phaltankar

Reputation: 3424

In Mongo data is stored in the form you want to query it.

The chat problem can be handled easily with a relational store. However, if you are keen to use Mongo, IMO a flat structure is the best option.

You can create a unique chatId for each sender/receiver pair and store each chat message as a separate document.

{
    _id:
    chatId: 1234,
    sender_id:
    receiver_id:
    subject:
    updated_at:
    message: {
            message:
            messageId: 1,
            author_id:
            attatchments: [x,y,z],
            read:
            created_at:
            }
},
{
    _id:
    chatId: 1234,
    sender_id:
    receiver_id:
    subject:
    updated_at:
    message: {
            message:
            messageId: 2,
            author_id:
            attatchments: [x,y,z],
            read:
            created_at:
            }
}
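
As a rough sketch of how one chatId per pair could be derived, the helper below sorts the two user ids and joins them, so both directions of a conversation map to the same value. This is only one possible scheme: chatIdFor and newMessageDoc are hypothetical names, the ids are assumed to be strings that do not contain ":", and the resulting chatId is a string rather than a number like the 1234 used above.

```javascript
// Hypothetical helper: one stable chatId for a pair of users,
// regardless of who is the sender. Assumes ids are strings without ":".
function chatIdFor(userA, userB) {
  return [String(userA), String(userB)].sort().join(":");
}

// Build one flat message document in the shape shown above.
function newMessageDoc(senderId, receiverId, messageId, text, now = new Date()) {
  return {
    chatId: chatIdFor(senderId, receiverId),
    sender_id: senderId,
    receiver_id: receiverId,
    updated_at: now,
    message: {
      message: text,
      messageId: messageId,
      author_id: senderId,
      attatchments: [], // field name kept as spelled in the question
      read: false,
      created_at: now,
    },
  };
}

const first = newMessageDoc("u42", "u7", 1, "hi");
const reply = newMessageDoc("u7", "u42", 2, "hello");

console.log(first.chatId === reply.chatId); // true
```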

Chats happen message by message (not in batches). The flat structure gives me quick reads and writes, and it also helps me provide search.

I can even provide pagination, for example showing the last 20 messages in a window where the user can click to load more (something like the query below).

db.collection.find(
  {chatId: 1234, "message.messageId": {$gte: 1}}
).sort({updated_at: -1})
.limit(20)

No doubt, the number of documents will grow very fast, but Mongo reads stay fast as long as you have proper indexes on the fields you query.
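
The "load more" behaviour and the index mentioned above could be sketched roughly as follows. The pageQuery helper is hypothetical; it only builds the filter, sort and limit values, and it assumes that messageId increases with every new message in a chat. It sorts on message.messageId instead of updated_at so that the filter and the sort can share one index; that is a choice made for this sketch, not something the query above requires.

```javascript
// Hypothetical helper: build the arguments for one page of a chat.
// `beforeMessageId` is the oldest messageId the client already has;
// leave it out to get the newest page.
function pageQuery(chatId, beforeMessageId, pageSize = 20) {
  const filter = { chatId: chatId };
  if (beforeMessageId !== undefined) {
    filter["message.messageId"] = { $lt: beforeMessageId };
  }
  return { filter: filter, sort: { "message.messageId": -1 }, limit: pageSize };
}

const newest = pageQuery(1234);   // first window of 20 messages
const older = pageQuery(1234, 81); // user clicked "load more" at messageId 81

console.log(JSON.stringify(older.filter));
// {"chatId":1234,"message.messageId":{"$lt":81}}

// In the mongo shell the result would be used roughly like this:
//   const q = pageQuery(1234, 81);
//   db.collection.find(q.filter).sort(q.sort).limit(q.limit)
// supported by a compound index such as:
//   db.collection.createIndex({chatId: 1, "message.messageId": -1})
```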



Finally, read my first line again: "In Mongo data is stored in the form you want to query it". Having a large number of documents is not a problem if you have the correct indexes, and that is true of any data store. Considering how Mongo's array operators work, I wouldn't favour an array of messages.

Suppose you have a single document per chat with an array of messages, and there are 10K (or any large number of) messages with attachments. Do you want to load all of them into memory whenever you query the chat document? Or are you only interested in the latest 1, 2 or 20 messages?

Now, think about splitting a single collection into two relational collections: IMO, at that point go for a relational data store.


How to decide:

The best way to design the schema is to list your requirements. If you are exposing your chat store as a service, make a list of the endpoints that service is going to expose.

How many different types of queries will you need to run in the near future?

What will the search keys be?

How many chat messages do you want to return in a single API call?

And so on. The answers to these questions will help you design your data store.

Upvotes: 8

Rupesh

Reputation: 890

If you want, you can divide your schema like this.

// Conversation schema
 {
    _id:
    sender_id:
    receiver_id:
    subject:
    created_at:
    updated_at:
    messagesId: [ ] // here you store the _id of each message exchanged between the two users
}



// Message Schema
 {   
   _id :             
   message:
   author_id:
   attatchments: [x,y,z],
   read:
   created_at:
 }

Upvotes: 1
