Reputation: 55
I am new to MongoDB so I apologize if these questions are simple.
I am developing an application that will track specific user interactions and put information about the user and the interactions into a MongoDB. There are several types of interactions that will all collect different information from the user.
My First question is: Should all of these interaction be in the same collection or should I separate them out by types (as you would do in a RDBMS)?
Additionally I would like to be able to look up:
I was thinking of putting a Manual reference to an interaction document for each interaction a user performs in his document and a manual reference to the user that performed the interaction in each interaction document.
My second questions is: Does this "doubling up" of Manual references make sense or is there a better way to do this?
Any thoughts would be greatly appreciated.
Thank you!
Upvotes: 1
Views: 1058
Reputation: 69703
My First question is: Should all of these interaction be in the same collection or should I separate them out by types (as you would do in a RDBMS)?
The greatest advantage of a document store over a relational database is precisely that you can do that. Put all different interactions into one collection and don't be afraid to give them different sets of fields.
Additionally I would like to be able to look up:
All the interactions a specific user has made
I was thinking of putting a Manual reference to an interaction document for each interaction a user performs in his document and a manual reference to the user that performed the interaction in each interaction document.
Note that it's usually not a good idea to have documents which grow indefinitely. MongoDB has an upper limit for document size (per default:16MB). MongoDB isn't good at handling large documents, because documents are loaded completely into ram cache. When you have many large objects, not much will fit into the cache. Also, when documents grow, they sometimes need to be moved to another hard drive location, which slows down updates (that also screws with natural ordering, but you shouldn't rely on that anyway).
All the users that have made a specific interaction
Are you referring to a specific interaction instance (assuming that multiple users can be part of one interaction) or all users which already performed a specific interaction type?
In the latter case I would add an array of performed interaction types to the user document, because otherwise you would have to perform a join-like operation, which would either require a MapReduce or some application-sided logic.
The the first case I would, contrary to what Sammaye suggests, recommend to use not the _id field of the user collection, but rather the username. When you use an index with the unique flag on user.username, it's just as fast as searching by user._id and uniqueness is guaranteed.
The reason is that when you search for the interactions by a specific user, it's more likely that you know the username and not the id. When you only have the username and you are referencing the user by id, you first have to search the users collection to get the _id of the username, which is a additional database query.
This of course assumes that you don't always have the user._id at hand. When you do, you can of course use _id as reference.
Upvotes: 0
Reputation: 43884
My First question is: Should all of these interaction be in the same collection or should I separate them out by types (as you would do in a RDBMS)?
Without knowing too much about your data size, write amount, read amount, querying needs etc I would say; yes, all in one collection.
I am not sure if separating them out is how I would design this in a RDBMS either.
"Does this "doubling up" of Manual references make sense or is there a better way to do this?"
No it doesn't make sound databse design to me.
Putting a user_id
on the interaction collection document sounds good enough.
So when you want to get all user interactions you just query by the interactions collection user_id
.
When you want to do it the other way around you query for all interactions that fit your query area, pull out those user_id
s and then do a $in
clause on the user collection.
Upvotes: 2