Reputation: 366
I am working on a web application using MongoDB, and I have some questions about the schema design.
I want to use MongoDB to store energy consumption data for each user. For each user, we will have electricity consumption data, where each reading is a timestamp and a consumption value.
So the question is how to store the readings in MongoDB. I see two ways of doing it.
1. Put everything in one collection. Each document would look like this:
{"user_id": "e211a233-808f-fc43-0800-c05650001785", "Value": 274, "Time": 1314691200}
Each user may have thousands of readings, and we have thousands of users, so there would be tens of millions of documents in one collection.
2. Put each user's data in its own collection. We would then have thousands of collections with thousands of documents in each.
Can anyone tell me which approach is better in terms of performance?
Upvotes: 2
Views: 2197
Reputation: 311
For anyone newly referred to this question:
MongoDB has some very useful video tutorials on this specific problem. See the following links; they should help:
Upvotes: 2
Reputation: 5662
Option 1 will leverage your indexes and scale out well. It'll be much easier to query and update efficiently than massive documents that are always changing. It will also make your queries much easier if you plan on aggregating that data in the future. Specifically, running the Aggregation Framework over separate documents is much more efficient than running it over arrays within documents, which have to be unwound first.
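As a rough sketch of what that looks like for option 1, the snippet below builds a compound index spec and an aggregation pipeline that sums one user's consumption per day. The field names come from the sample document in the question; the collection name `readings`, the helper name `dailyTotalsPipeline`, and the per-UTC-day bucketing are assumptions made for illustration.

```javascript
// Hypothetical compound index for a one-document-per-reading collection.
// In the mongo shell (collection name "readings" is an assumption):
//   db.readings.createIndex(readingsIndex)
const readingsIndex = { user_id: 1, Time: 1 };

const SECONDS_PER_DAY = 86400;

// Build a pipeline that sums one user's consumption per UTC day.
// Time is assumed to be a Unix timestamp in seconds, as in the question.
function dailyTotalsPipeline(userId, fromTs, toTs) {
  return [
    // Served by the { user_id, Time } index: equality first, then a range.
    { $match: { user_id: userId, Time: { $gte: fromTs, $lt: toTs } } },
    // Bucket each reading by the start of its day (Time minus Time mod 86400).
    { $group: {
        _id: { $subtract: ["$Time", { $mod: ["$Time", SECONDS_PER_DAY] }] },
        total: { $sum: "$Value" }
    } },
    { $sort: { _id: 1 } }
  ];
}

// In the mongo shell:
//   db.readings.aggregate(dailyTotalsPipeline(someUserId, fromTs, toTs))
```

No $unwind stage is needed here because every reading is already its own document.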
Also, if you planned on embedding in the region of 150K entries like that in a single document, you could run into the 16MB single-document limit, so I think you're almost always better off with single documents in a big collection as per option 1.
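To put a number on that limit, here is a back-of-the-envelope sketch. The 16MB cap is 16 × 1024 × 1024 bytes; the per-entry sizes passed in are assumptions you would replace with measured sizes from your own data, and the calculation ignores the overhead of the enclosing document.

```javascript
// MongoDB's maximum BSON document size: 16 MiB.
const MAX_BSON_BYTES = 16 * 1024 * 1024; // 16,777,216 bytes

// How many embedded entries fit, given an assumed average size per entry.
function maxEmbeddedEntries(bytesPerEntry) {
  return Math.floor(MAX_BSON_BYTES / bytesPerEntry);
}

// Whether entryCount entries of the assumed size stay under the limit.
function fitsInOneDocument(entryCount, bytesPerEntry) {
  return entryCount * bytesPerEntry <= MAX_BSON_BYTES;
}

// Example: at an assumed 112 bytes per entry, 150,000 entries need
// 16,800,000 bytes, which is just over the limit.
console.log(maxEmbeddedEntries(112));        // 149796
console.log(fitsInOneDocument(150000, 112)); // false
```

So 150K embedded entries cross the limit once each entry averages roughly 112 bytes or more; smaller entries would fit, but the document would still be rewritten on every append.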
[Update]
Looking again, I see that you haven't mentioned what queries you would run over the data. That's the key to this. But given that your data looks historical, it points more and more towards putting the data into millions of documents. Map-Reduce would be your friend here for analysis.
Upvotes: 1
Reputation: 2480
You can go with option 1 and also shard your data across multiple nodes for performance.
Alternatively, if it's an option, I would personally keep one daily entry per user and update it like this:
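db.coll.update(
  // one document per user per day; _id must be unique, so the user id goes in its own field
  { userId : userId, date : '12/11/2012' },
  { $inc : { consumption : value } },
  { upsert : true } // insert the document if it does not exist; $inc then sets consumption to value
)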
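To make the effect of that upsert concrete, here is a small in-memory stand-in written in plain JavaScript. It is not MongoDB code: the `Map` and the helper name `recordConsumption` are hypothetical, and the sketch only mirrors how repeated upserts with $inc accumulate one running total per user per day.

```javascript
// In-memory stand-in for the daily-document collection (not MongoDB itself).
const dailyTotals = new Map();

// Mimics an upsert with { $inc: { consumption: value } }:
// create the (user, date) document if missing, then add value to its total.
function recordConsumption(userId, date, value) {
  const key = userId + "|" + date;
  const doc = dailyTotals.get(key) || { userId: userId, date: date, consumption: 0 };
  doc.consumption += value;
  dailyTotals.set(key, doc);
  return doc;
}

recordConsumption("u1", "12/11/2012", 274); // first write creates the document
recordConsumption("u1", "12/11/2012", 26);  // later writes add to the same document
console.log(dailyTotals.get("u1|12/11/2012").consumption); // 300
```

The trade-off is that individual readings and their timestamps are lost; only the daily total survives.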
If you will not be querying the data too often, you can also add the entries in a single daily document in a day collection like so:
db.days.update(
  { day : '12/11/2012' },
  { $addToSet : { todaysConsumptions : { userId : id, consumption : value, time : timestamp } } },
  { upsert : true } // create the day's document on the first write
)
To query data stored this last way, use the aggregation framework with the $unwind stage on the todaysConsumptions field. $unwind emits one document per element of the embedded array, and those documents can then be grouped, summed, counted, etc.
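A minimal sketch of such a query follows, together with a plain-JavaScript illustration of what $unwind does to a single document. The field and collection names follow the snippet above; the helper names `perUserTotalsPipeline` and `unwind` are made up for this example.

```javascript
// Build a pipeline that totals each user's consumption for one day document.
function perUserTotalsPipeline(day) {
  return [
    { $match: { day: day } },
    // One output document per element of the todaysConsumptions array.
    { $unwind: "$todaysConsumptions" },
    { $group: {
        _id: "$todaysConsumptions.userId",
        total: { $sum: "$todaysConsumptions.consumption" }
    } }
  ];
}

// In the mongo shell:
//   db.days.aggregate(perUserTotalsPipeline('12/11/2012'))

// Plain-JavaScript illustration of $unwind on one document (not MongoDB code):
// each array element becomes its own copy of the parent document.
function unwind(doc, field) {
  return doc[field].map(function (element) {
    return Object.assign({}, doc, { [field]: element });
  });
}
```

For example, calling `unwind` on a day document whose array holds two entries returns two documents, each carrying the same `day` value and a single entry in `todaysConsumptions`.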
Upvotes: 1