Reputation: 221
I am developing a social networking site where users can post statuses and share, like, or comment on others' posts, among other features. I am using MongoDB and Node.js, and I am unsure whether to embed or to reference the documents that store this data.
I have a collection called "activity" that stores every activity a user performs (share/post/like/comment), with a "type" field that specifies the kind of activity. If a user performs a "comment" action, how should I store that information? I want a user who comments on a post to also share that post with his/her friends, so that they can see both the comments and the post. Should I duplicate the shared content and embed it in a new document with type "share", or should I just store a reference to that content?
Schematically, should I do:
1. embedding:
var activity = new Schema({
  type: String, // specifies the type of activity
  content: [{
    // the object that the user shares/likes/comments on
  }]
});
or
2. referencing:
var activity = new Schema({
  type: String, // in this case would be "share"
  content_id: Schema.Types.ObjectId // the id of the thing I share
});
With the embedding approach, the embedded copy of the commented content would not contain the latest comments, since it is a duplicate of the original and any comments added after it was shared would only update the original.
With the referencing approach, I have a lot of difficulty retrieving the results without a mess of callbacks, since some referenced objects reference other objects in turn (for example, posts reference comments).
What are the best practices for my situation?
Upvotes: 3
Views: 6099
Reputation: 720
Jeremy,
You may want to answer the following questions to gain a better understanding of how you will end up using this data in the future. Working through them should also point you toward a specific schema design:
Is there a limit on the amount of activity for a user? The intent here is to identify whether a very active user is going to hit the document size limit of 16MB. If so, embedding an array of activities may not help.
Are these records (the activity array) going to be indexed or queried in the future? The intent here is to identify slow, heavy updates (adding a new activity) caused by the document growing in size, which triggers relocation and index updates. As the user's activity array grows, performance would gradually suffer because of the number of index entries that need to be maintained.
How are these activities used from the user's point of view: are they all viewed at once, or are they paginated? If paginated, you could duplicate the most recent 'N' activity entries in the main document and store the others separately.
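The "most recent N in the main document" idea can be sketched roughly as follows. This is only an illustration under assumptions: the field name recent_activity, the constant RECENT_LIMIT, and both helper functions are made up for the example. The update document relies on MongoDB's $push with the $each and $slice modifiers, where a negative $slice keeps the last N elements; applyPushRecent is an in-memory stand-in so the snippet runs without a database. The full activity would still be written to the separate "activity" collection as well.

```javascript
// Hypothetical names throughout: RECENT_LIMIT, recent_activity,
// buildPushRecentUpdate and applyPushRecent are illustrative only.
const RECENT_LIMIT = 3; // assumed value of N; must be >= 1 in this sketch

// Builds the update document you would hand to an update call on the
// user's document. $slice with a negative number keeps the last N items.
function buildPushRecentUpdate(activity, limit = RECENT_LIMIT) {
  return {
    $push: {
      recent_activity: {
        $each: [activity],
        $slice: -limit,
      },
    },
  };
}

// In-memory stand-in that mimics the effect of the update above,
// so the example can run without a MongoDB server.
function applyPushRecent(userDoc, update) {
  const spec = update.$push.recent_activity;
  const combined = userDoc.recent_activity.concat(spec.$each);
  return Object.assign({}, userDoc, {
    recent_activity: combined.slice(spec.$slice),
  });
}

// Usage: after four activities only the newest three remain embedded.
let user = { _id: 'u1', recent_activity: [] };
for (const type of ['post', 'like', 'comment', 'share']) {
  user = applyPushRecent(user, buildPushRecentUpdate({ type }));
}
console.log(user.recent_activity.map((a) => a.type));
// prints [ 'like', 'comment', 'share' ]
```

The first page of a feed can then be served from the user document alone, and older pages fall back to a query on the separate collection.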
I would also suggest looking at the use case for comments described at http://docs.mongodb.org/ecosystem/use-cases/storing-comments/ .
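On the callback concern with the referencing option: one way to keep the lookups flat is to work with promises instead of nested callbacks. The sketch below is not tied to any particular driver. The Maps, the sample data, and the names findById and loadActivity are hypothetical stand-ins for real collection queries, and the comment_ids field is an assumption about how posts reference comments. With Mongoose, declaring a ref on the schema path and calling populate() on the query is the usual way to get a similar result.

```javascript
// In-memory stand-ins for the "posts" and "comments" collections.
// All names and data here are hypothetical.
const posts = new Map([
  ['p1', { _id: 'p1', text: 'Hello', comment_ids: ['c1', 'c2'] }],
]);
const comments = new Map([
  ['c1', { _id: 'c1', text: 'First' }],
  ['c2', { _id: 'c2', text: 'Second' }],
]);

// Simulated asynchronous lookup, standing in for a real database query.
function findById(collection, id) {
  return Promise.resolve(collection.get(id) || null);
}

// Resolves activity -> post -> comments as a flat sequence of awaits
// rather than callbacks nested inside callbacks.
async function loadActivity(activity) {
  const post = await findById(posts, activity.content_id);
  if (!post) {
    return Object.assign({}, activity, { content: null });
  }
  // Fetch all referenced comments in parallel.
  const resolved = await Promise.all(
    post.comment_ids.map((id) => findById(comments, id))
  );
  return Object.assign({}, activity, {
    content: Object.assign({}, post, { comments: resolved }),
  });
}

// Usage: a "share" activity that references post p1.
loadActivity({ type: 'share', content_id: 'p1' }).then((result) => {
  console.log(result.content.comments.map((c) => c.text));
});
```

Because the shared activity stores only content_id, the comments are read from the original post at query time, so they are always current.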
Hope that helps you make a decision that serves you well in the long term.
Upvotes: 3