Reputation: 24207
A data structure might look like this:
{
    post_title : String,
    post_date : Date,
    user_id : ObjectID,
    post_content : String,
    comments : [
        {
            comment_date : Date,
            comment_content : String,
            user_id : ObjectID
        }
    ]
}
The system I am working on has a structure similar to the above. The content of the post_* fields will likely never change, but the content in the comments array will be updated and edited very often.
Since the above structure is a single document, updating or adding a single comment requires reading the whole document, editing it and saving it. It also makes caching difficult: the post_* content can be cached for a long time, but the comments can't.
What is the best strategy here? Is it better to give the comments their own collection?
As far as query time goes, I will still need to hit the database to extract the comments, but when updating or adding comments, the size of the document will be much smaller.
Upvotes: 2
Views: 161
Reputation: 5548
Another thing to consider: do you know in advance how large you expect your "comments" array to become?
Each MongoDB document has a size limit of 16 MB. This is rarely an issue, but it is something to keep in mind, and a reason to avoid adding sub-documents "ad infinitum" to an embedded array.
Furthermore, Mongo preallocates space for documents to grow. (Please see the documentation on "Padding Factor" for more information: http://www.mongodb.org/display/DOCS/Padding+Factor) If enough embedded documents are pushed to an array to cause a document to grow beyond its preallocated slot on disk, then the entire document will have to be moved to a different location on disk. You may find that this causes undesired disk I/O.
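To get a feel for how close a post could come to that limit, here is a minimal sketch that estimates document size in plain JavaScript. It is an assumption-heavy approximation: it measures the JSON byte length rather than the real BSON size (the two differ), and the helper names makePost, approxBytes and fractionOfLimit are made up for illustration.

```javascript
// Rough size estimate for a post with an embedded comments array.
// Assumption: JSON byte length is only a stand-in for the real BSON size.
const MAX_DOC_BYTES = 16 * 1024 * 1024; // the 16 MB per-document limit

// Approximate the stored size of a document via its JSON encoding.
function approxBytes(doc) {
  return Buffer.byteLength(JSON.stringify(doc), "utf8");
}

// Build a hypothetical post with `commentCount` comments of `commentLength` characters.
function makePost(commentCount, commentLength) {
  const comments = [];
  for (let i = 0; i < commentCount; i++) {
    comments.push({
      comment_date: new Date(0).toISOString(),
      comment_content: "x".repeat(commentLength),
      user_id: "0".repeat(24), // placeholder for a 24-character ObjectID string
    });
  }
  return {
    post_title: "title",
    post_date: new Date(0).toISOString(),
    user_id: "0".repeat(24),
    post_content: "content",
    comments,
  };
}

// Fraction of the 16 MB limit that the document (approximately) uses.
function fractionOfLimit(doc) {
  return approxBytes(doc) / MAX_DOC_BYTES;
}

console.log(fractionOfLimit(makePost(1000, 500)));
```

With numbers like these (1000 comments of 500 characters each) the document stays far below the limit, which matches the "rarely an issue" remark above; the estimate is mainly useful for spotting unbounded growth.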
If you anticipate that each document will have a maximum number of embedded documents, a common practice is to prepopulate the array when new documents are created. For example, if you anticipate that each post will have 100 comments, you can create new post documents with a "comments" array that contains 100 embedded documents with 'garbage' data. Once the new document is created, the disk space will be preallocated, and the garbage data may be deleted, leaving room for the document to grow in size.
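The prepopulation idea can be sketched as follows. This is only an illustration under stated assumptions: the collection name posts, the constants and the helper names are hypothetical, and the shell calls appear only in comments. Note also that the padding behaviour belongs to the older MMAPv1 storage engine, so the trick may have no effect on newer storage engines.

```javascript
// Sketch of the "prepopulate, then clear" pattern.
const EXPECTED_COMMENTS = 100; // assumption: expected comments per post
const FILLER_LENGTH = 200;     // assumption: typical comment length in characters

// One throwaway comment whose only job is to take up space.
function fillerComment() {
  return {
    comment_date: new Date(0),
    comment_content: "x".repeat(FILLER_LENGTH),
    user_id: "0".repeat(24), // placeholder for an ObjectID
  };
}

// New post whose comments array is padded with filler entries.
function newPaddedPost(title, content, userId) {
  return {
    post_title: title,
    post_date: new Date(),
    user_id: userId,
    post_content: content,
    comments: Array.from({ length: EXPECTED_COMMENTS }, fillerComment),
  };
}

// Update document that removes the filler once the post has been stored.
const clearFiller = { $set: { comments: [] } };

// Against a real server the two steps would look roughly like:
//   db.posts.insert(padded);
//   db.posts.update({ _id: padded._id }, clearFiller);
const padded = newPaddedPost("Hello", "First post", "0".repeat(24));
console.log(padded.comments.length);
```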
Hopefully this will give you some additional food for thought when designing your document structure.
Upvotes: 1
Reputation: 6041
What is the point of storing comments in a nested collection? I suggest you use a separate collection for comments, linked with a DBRef or even with manual references.
The size of the document is not the only problem (and I don't think it is a problem at all). One common task is showing users the last N comments, which is rather hard to do with your structure.
I used your structure for my application, and later I had to rewrite it with a standalone collection.
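A minimal sketch of the standalone-collection layout with manual references, using in-memory arrays in place of real collections. The names posts, comments and post_id are assumptions made for the example, and the shell queries in the comments are rough equivalents rather than tested commands.

```javascript
// In-memory stand-ins for two collections linked by a manual reference.
const posts = [
  { _id: "p1", post_title: "Hello", post_content: "First post" },
  { _id: "p2", post_title: "Again", post_content: "Second post" },
];

// Each comment stores the _id of its post in a hypothetical post_id field.
const comments = [
  { _id: "c1", post_id: "p1", user_id: "u1", comment_date: new Date("2012-01-01"), comment_content: "first" },
  { _id: "c2", post_id: "p2", user_id: "u2", comment_date: new Date("2012-01-03"), comment_content: "second" },
  { _id: "c3", post_id: "p1", user_id: "u2", comment_date: new Date("2012-01-05"), comment_content: "third" },
];

// Last N comments across all posts, newest first.
// Rough shell equivalent: db.comments.find().sort({ comment_date: -1 }).limit(n)
function lastComments(all, n) {
  return [...all].sort((a, b) => b.comment_date - a.comment_date).slice(0, n);
}

// All comments belonging to one post.
// Rough shell equivalent: db.comments.find({ post_id: postId })
function commentsForPost(all, postId) {
  return all.filter((c) => c.post_id === postId);
}

console.log(lastComments(comments, 2).map((c) => c._id));
console.log(commentsForPost(comments, "p1").map((c) => c._id));
```

With comments in their own collection, "last N comments" becomes a single sorted query, whereas with embedded arrays it would mean scanning the comments inside every post.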
Upvotes: 1
Reputation: 51319
In Mongo you can append to an array without reading it first. See the $push update operator. It doesn't help you with regard to caching, but it removes the need to read the document before updating it.
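As a rough illustration, the update document for such an append could look like the one below. The applyPush function is a tiny in-memory model written for this example, not part of any driver, and the collection name posts is an assumption.

```javascript
// Minimal in-memory model of what $push does to an array field.
// It returns a new document and leaves the original untouched.
function applyPush(doc, update) {
  const result = { ...doc };
  for (const [field, value] of Object.entries(update.$push || {})) {
    result[field] = [...(doc[field] || []), value];
  }
  return result;
}

const newComment = {
  comment_date: new Date("2012-01-01"),
  comment_content: "Nice post",
  user_id: "u1", // placeholder for an ObjectID
};

// Filter and update documents as they would be sent to the server.
const filter = { _id: "p1" };
const update = { $push: { comments: newComment } };

// Against a real server this would be roughly:
//   db.posts.update(filter, update);
const post = { _id: "p1", post_title: "Hello", comments: [] };
const updated = applyPush(post, update);
console.log(updated.comments.length);
```

The point is that the client only sends the new comment and the filter; the existing comments never have to travel to the application and back.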
Upvotes: 2