Reputation: 1159
I am reading on MongoDB and still cannot understand in what cases should I prefer document-oriented DBs over traditional ones like Postgres.
Consider the following: I have a blog, which consists of posts, and users can write comments on posts. So I create two collections: users
and posts
. Each post
contains an array of embedded documents - comments
, so that each time I load a post
from DB, I automatically get all comments
. This is fine and in fact Mongo seems to be a better choice if I know for sure that relations won't change. But everything does.
Suppose I will decide to show user's comments on his/her profile page. I will have to create an index on posts.comments.user_id
and select from posts
by user._id
. So this is basically the same thing I do in RDBMS. Or I'll have to copy all comments
from posts
to users
and manually make sure these two sets stay synched in the future (no triggers & no transactions in Mongo).
Now I add a new field to posts, like rating
and for those records that already exist I want to set rating = number of comments
. But I can't do this kind of mass-update. Either I set the same value for all updated documents ({ $set: {rating: 1} }
) or I have to manually loop one-by-one.
How do I find most active users? Because there are no joins
- first, I will have to extract top posts.comments.user_id
and keep them somewhere. Then go one _id
at a time and select user.name
. This can be solved in by keeping user.name
in comment
. But what if later I will need more fields?
Have I got something wrong? These three cases seem to be simple extensions for a blog application. If I am not 100% sure from the beginning about the structure of relations and statistics, and ... (which in 99% real cases) I shouldn't be using Mongo?
Upvotes: 0
Views: 267
Reputation: 356
You simply solve your problem through your programming language. It's instinct to modify your SQL query to vary your results, but you'd simply use your javascript to pull the new piece of data in.
Upvotes: 0
Reputation: 3760
Very interesting and instructive questions.
Changes in requirements mean changes in the software. Moving the comments from the posts to the user would definitely be a pretty drastic change, but I would still argue that with MongoDB a change like that can be less painful than with RDBMS. For example, the application could be modified so it can deal with both variants at the same time: comments in posts and comments in users. This way, the data could be migrated to the new structure over time rather than all at once. -- Keeping comments both in the user and in the post seems a rather bad decision to me though. If the data changes often, that is a strong argument not to denormalize it.
The update you describe is trivial to implement in any programming language or in MongoDB's native javascript shell. The fact that the query language itself doesn't allow that kind of complex update does not mean that it cannot be done easily with MongoDB. I would argue that the query language being more stream-lined than full-blown SQL is actually a good thing.
Careful denormalizing (by putting user.name into the comment itself) can indeed help here. Since user.name would rarely change, that is a much better case for that kind of denormalization than putting the comments into both users and posts. Once again, the flexibility of MongoDB begins to shine if you do this selectively, dynamically — you can put the user.name into a comment when you get to it, but the application can be written so it can handle both cases: if the user.name is not in the comment it is looked up (and then maybe written to the comment to save future lookups). You can "cache" more attributes in the comment as you go.
In general, MongoDB leads to a more flexible, agile way of thinking about data structures and implementing them in the database. Many programmers find that more intuitive and easier to work with than SQL's rigidity. But there is no technology that has only advantages, and you can always find problems where one approach leads to a better, more elegant solution than the other.
Upvotes: 2