sirjay

Reputation: 1766

How should I store data in MongoDB / NoSQL?

I am writing a large social network with Node.js and MongoDB (using the mongoose module). That means there will be many users and a lot of data in the database.

I have implemented user registration, and now I need to let users send private messages to each other.

Questions:

1) How should I store private messages? I have thought of two ways:

First

var mongoose = require('mongoose');

var schemaUser = new mongoose.Schema({
    i: Number, // numeric user id
    ...
    message: { type: mongoose.Schema.Types.ObjectId, ref: 'Message' } // the user's Message document
});
var schemaMessage = new mongoose.Schema({
    m: [{
        f: Number, // value i from schemaUser: the sender
        m: String, // message text
        d: { type: Date, default: Date.now } // date sent
    }]
});
// db is assumed to be an already opened mongoose connection
module.exports = {
    User: db.model('User', schemaUser),
    Message: db.model('Message', schemaMessage)
};

This way, every user has a message field that references one document in the Message collection, and that document has a single array m that stores all of that user's messages.

Second

I store every message as its own document in the Message collection, like this:

var schemaMessage = new mongoose.Schema({
    t: Number, // value i from schemaUser: the recipient
    f: Number, // value i from schemaUser: the sender
    m: String, // message text
    d: { type: Date, default: Date.now } // date sent
});

All messages are mixed together in one collection. As I understand it, the disadvantage of this method is that there might be more than a million private messages in the database, so finding the messages sent from or to a particular user will be slow, whereas with the first method all of a user's messages are already together in one array.

So, which way should I choose, or are there any other ideas?

2) Suppose I have an array like in the first method: var arr = []. How many objects can I put in arr, and what is its maximum size? For example, if I push something like arr.push({t: #, f: #, m: 'message...'})?

Upvotes: 3

Views: 7621

Answers (1)

Philipp

Reputation: 69663

In general, MongoDB encourages embedding data instead of using relations, because this allows you to get all relevant data with a single query. There is, however, an exception: MongoDB doesn't like documents which grow indefinitely.

When a document gradually grows over its lifetime, the database has to reallocate hard drive space frequently. This slows down writes and leads to database fragmentation. Also, documents have a hardcoded size limit of 16MB (mostly to discourage document growth). A user accumulating more and more private messages during his membership would be a good example of indefinite growth.
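To put the 16MB limit in perspective for the array question, here is a rough back-of-the-envelope sketch. It measures the JSON text size of one sample message as a stand-in for its BSON size, which is an assumption: the real encoded size will differ. The function name and the sample values are made up for illustration.

```javascript
// Rough estimate only: JSON text length is used as a stand-in for BSON size
// (an assumption; the real encoded size of a document will differ).
const MAX_DOCUMENT_BYTES = 16 * 1024 * 1024; // MongoDB's 16MB document limit

// Hypothetical helper: how many copies of a message of this shape would fit
// into a single document if nothing else were stored in it.
function estimateMessagesPerDocument(sampleMessage) {
    const bytesPerMessage = Buffer.byteLength(JSON.stringify(sampleMessage), 'utf8');
    return Math.floor(MAX_DOCUMENT_BYTES / bytesPerMessage);
}

// Sample message shaped like the entries of the m array in the first schema,
// with a 200-character message text.
const sampleMessage = { f: 42, m: 'x'.repeat(200), d: new Date(0) };
const estimatedCapacity = estimateMessagesPerDocument(sampleMessage);

console.log('Roughly ' + estimatedCapacity + ' messages of this size per document');
```

Whatever the exact number is, it is finite, so an array that only ever grows will eventually hit the limit for a sufficiently active user.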

In your situation it is important to identify your most frequent use-case. How are you going to present private messages to the user? Will they see all messages they ever got with their full text on one long HTML page? Unlikely.

You likely want to list them like an email inbox with sender and headline, and show the actual content when the user clicks on them. When that's the case, you should just store an array with this meta-data, and store the actual content in a different collection which is queried when the user actually clicks on a message. You still have growth that way, but it would be less of a problem, because you have a lot less data per message stored in the user document.
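The inbox layout described above can be sketched with plain objects standing in for MongoDB documents. This is only an in-memory illustration, not database code: the Map plays the role of the separate content collection, and the names inbox, s (headline), deliverMessage, listInbox and openMessage are hypothetical.

```javascript
// In-memory stand-ins (assumptions for illustration, not MongoDB calls).
// messageBodies plays the role of a separate collection holding full texts.
const messageBodies = new Map();

// A user document that embeds only lightweight inbox metadata.
const inboxUser = { i: 1, inbox: [] };

let nextMessageId = 1; // hypothetical id generator; MongoDB would assign an ObjectId

// Store the full text separately and embed only sender, headline and date.
function deliverMessage(recipient, from, headline, body) {
    const id = nextMessageId++;
    messageBodies.set(id, { _id: id, m: body });
    recipient.inbox.push({ id: id, f: from, s: headline, d: new Date() });
    return id;
}

// Listing the inbox needs only the user document.
function listInbox(recipient) {
    return recipient.inbox.map(function (entry) {
        return entry.f + ': ' + entry.s;
    });
}

// The full text is looked up only when a message is opened.
function openMessage(id) {
    const doc = messageBodies.get(id);
    return doc ? doc.m : null;
}

const helloId = deliverMessage(inboxUser, 7, 'Hello', 'Hi, how are you doing?');
console.log(listInbox(inboxUser));
console.log(openMessage(helloId));
```

The point of the split is that the embedded entries stay small, so rendering the inbox list never has to load any message text.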

It's also likely that on each normal page view you only want to show the messages which are unread, while the complete list of old private messages is a special page which is used less often. When that's the case, you would only embed the unread messages in the User document, and move them to another collection after they are read. This prevents document growth because most users will keep their list of unread messages short.
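A minimal in-memory sketch of this unread-then-archive idea follows. Plain arrays and objects stand in for the User document and the archive collection, so this shows the data movement only, not real MongoDB calls. The names unread, archivedMessages, receiveMessage and markAsRead are hypothetical.

```javascript
// Stand-in for a separate collection of already-read messages (assumption).
const archivedMessages = [];

// A user document that embeds only the messages not yet read.
const unreadUser = { i: 1, unread: [] };

// New messages are embedded in the recipient's document.
function receiveMessage(userDoc, id, from, text) {
    userDoc.unread.push({ id: id, f: from, m: text, d: new Date() });
}

// Reading a message removes it from the user document and appends it to the
// archive, tagged with the recipient (t) so it can be found again later.
function markAsRead(userDoc, messageId) {
    const index = userDoc.unread.findIndex(function (msg) {
        return msg.id === messageId;
    });
    if (index === -1) {
        return false; // nothing to move
    }
    const moved = userDoc.unread.splice(index, 1)[0];
    archivedMessages.push(Object.assign({ t: userDoc.i }, moved));
    return true;
}

receiveMessage(unreadUser, 101, 7, 'first message');
receiveMessage(unreadUser, 102, 8, 'second message');
markAsRead(unreadUser, 101);
console.log(unreadUser.unread.length, archivedMessages.length);
```

In a real database the removal and the archive write are two separate operations, so the application would need to decide what happens if one succeeds and the other fails.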

Upvotes: 7
