Reputation: 18580
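I have a time series that grows and is (potentially) revised through time:
As known on "2013-01-01" (first version):
"2013-01-01" 10
As known on "2013-01-02" (the value for "2013-01-01" is revised from 10 to 11):
"2013-01-01" 11
As known on "2013-02-01" (the value for "2013-02-01" is released):
"2013-01-01" 11
"2013-02-01" 20
As known on "2013-02-02" (the value for "2013-02-01" is revised from 20 to 21):
"2013-01-01" 11
"2013-02-01" 21
Query 1: get the latest version of the series:
"2013-01-01" 11
"2013-02-01" 21
Query 2: get the series as it was known on a given date. For instance, querying with "2013-02-01", I need to get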
"2013-01-01" 11
"2013-02-01" 20
I need help structuring my documents. As I come from a relational background, I am not sure about the implications of each structure. I have identified two possible structures, and would be happy to get feedback on them or suggestions for other structures.
Option A: one document per (date, version_date) pair:
{
"id":"1",
"date":"2013-01-01",
"version_date":"2013-01-01",
"value":10
}
{
"id":"1",
"date":"2013-01-01",
"version_date":"2013-01-02",
"value":11
}
{
"id":"1",
"date":"2013-02-01",
"version_date":"2013-02-01",
"value":20
}
{
"id":"1",
"date":"2013-02-01",
"version_date":"2013-02-02",
"value":21
}
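To make the two queries concrete against this first structure (one document per version, called option A in the answers), here is a minimal in-memory Python sketch. It uses a plain list of dicts rather than MongoDB calls, and the function name `series_as_of` is made up for this illustration.

```python
# The four flat documents from the question, held in a plain list
# (no database involved; this only illustrates the query semantics).
DOCS_A = [
    {"id": "1", "date": "2013-01-01", "version_date": "2013-01-01", "value": 10},
    {"id": "1", "date": "2013-01-01", "version_date": "2013-01-02", "value": 11},
    {"id": "1", "date": "2013-02-01", "version_date": "2013-02-01", "value": 20},
    {"id": "1", "date": "2013-02-01", "version_date": "2013-02-02", "value": 21},
]


def series_as_of(docs, as_of=None):
    """Return {date: value}, keeping for each date the newest version whose
    version_date is <= as_of. as_of=None means "latest version"."""
    best = {}  # date -> document holding the newest qualifying version
    for doc in docs:
        if as_of is not None and doc["version_date"] > as_of:
            continue  # this revision was not yet known on as_of
        current = best.get(doc["date"])
        # ISO 8601 date strings order correctly when compared as strings.
        if current is None or doc["version_date"] > current["version_date"]:
            best[doc["date"]] = doc
    return {date: doc["value"] for date, doc in sorted(best.items())}


print(series_as_of(DOCS_A))                # latest version of the series
print(series_as_of(DOCS_A, "2013-02-01"))  # series as known on 2013-02-01
```

The second call reproduces the expected result for "2013-02-01" given earlier: the revision dated "2013-02-02" is skipped because it was not yet known.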
Option B: one document per date, with all its versions in an embedded array:
{
"id":"1",
"date":"2013-01-01",
"values" : [
{
"version_date":"2013-01-01",
"value":10
},
{
"version_date":"2013-01-02",
"value":11
}
]
}
{
"id":"1",
"date":"2013-02-01",
"values" : [
{
"version_date":"2013-02-01",
"value":20
},
{
"version_date":"2013-02-02",
"value":21
}
]
}
In option B, I am also concerned that the update query might be a bit more difficult, because each document has a growing part (the array of versions), and I am not sure how well MongoDB supports or optimises that.
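A minimal in-memory sketch of what the write and the "as known on" read look like for option B. The `collection` dict and both function names are hypothetical stand-ins, not MongoDB API. In MongoDB itself the append would typically be expressed as an update using the `$push` operator with the upsert option, but that is not shown here.

```python
def record_version(collection, series_id, date, version_date, value):
    """Append a revision to the document for (series_id, date), creating the
    document the first time that date is seen. `collection` is a hypothetical
    dict keyed by (id, date) standing in for a MongoDB collection."""
    key = (series_id, date)
    doc = collection.setdefault(key, {"id": series_id, "date": date, "values": []})
    doc["values"].append({"version_date": version_date, "value": value})
    return doc


def series_as_of_b(collection, as_of):
    """For each date, pick the newest embedded version with
    version_date <= as_of (ISO date strings compare correctly as strings)."""
    result = {}
    for doc in collection.values():
        known = [v for v in doc["values"] if v["version_date"] <= as_of]
        if known:  # skip dates that had no version yet on as_of
            newest = max(known, key=lambda v: v["version_date"])
            result[doc["date"]] = newest["value"]
    return dict(sorted(result.items()))


store = {}
record_version(store, "1", "2013-01-01", "2013-01-01", 10)
record_version(store, "1", "2013-01-01", "2013-01-02", 11)
record_version(store, "1", "2013-02-01", "2013-02-01", 20)
record_version(store, "1", "2013-02-01", "2013-02-02", 21)
print(series_as_of_b(store, "2013-02-01"))
```

The write itself stays a single append per revision; the cost to weigh is that each document keeps growing, and that every read has to filter inside the array.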
EDIT: I am also considering option C, which stores the latest value alongside the array to speed up query 1 (although it might slow writes down a bit):
{
"id":"1",
"date":"2013-01-01",
"values" : [
{
"version_date":"2013-01-01",
"value":10
},
{
"version_date":"2013-01-02",
"value":11
}
],
"last_value":11
}
{
"id":"1",
"date":"2013-02-01",
"values" : [
{
"version_date":"2013-02-01",
"value":20
},
{
"version_date":"2013-02-02",
"value":21
}
],
"last_value":21
}
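The trade-off in option C can be sketched in memory as follows. The names `record_version_c` and `latest_series` and the dict-based `collection` are hypothetical, and the sketch assumes revisions arrive in increasing `version_date` order, so the most recent write is always the latest value.

```python
def record_version_c(collection, series_id, date, version_date, value):
    """Append a revision and refresh the denormalized last_value field.
    Assumes revisions arrive in increasing version_date order."""
    key = (series_id, date)
    doc = collection.setdefault(
        key, {"id": series_id, "date": date, "values": [], "last_value": None}
    )
    doc["values"].append({"version_date": version_date, "value": value})
    doc["last_value"] = value  # the extra write that option C pays for
    return doc


def latest_series(collection):
    """Latest version of the series: read last_value directly, without
    scanning the embedded arrays."""
    ordered = sorted(collection.values(), key=lambda d: d["date"])
    return {doc["date"]: doc["last_value"] for doc in ordered}


store_c = {}
record_version_c(store_c, "1", "2013-01-01", "2013-01-01", 10)
record_version_c(store_c, "1", "2013-01-01", "2013-01-02", 11)
record_version_c(store_c, "1", "2013-02-01", "2013-02-01", 20)
record_version_c(store_c, "1", "2013-02-01", "2013-02-02", 21)
print(latest_series(store_c))
```

If revisions could arrive out of order, the write would have to compare version dates before overwriting `last_value`, which is one more thing to keep consistent.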
Upvotes: 1
Views: 2019
Reputation: 10859
There's actually a very recent post on the official MongoDB blog covering this topic: http://blog.mongodb.org/post/65517193370/schema-design-for-time-series-data-in-mongodb. Take a look at it and ask any additional questions if required.
Upvotes: 1
Reputation: 222491
As with all questions like this, you are the only person who can answer it. If you have your data, try both approaches, benchmark them on real data with real queries, and compare the results. If you do not have data yet, try to simulate it.
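As a starting point for simulating data, here is a small sketch that generates synthetic versioned documents in the flat, one-document-per-version shape and times a latest-version lookup over them. It only measures in-memory Python, so it says nothing about MongoDB's performance. The generator parameters and function names are invented for this example, and a real benchmark would load the generated documents into the database and time the actual queries.

```python
import random
import timeit
from datetime import date, timedelta


def simulate_docs(n_dates=100, max_revisions=5, seed=0):
    """Generate flat documents: each date gets 1..max_revisions versions,
    published on consecutive days starting on the date itself."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    start = date(2013, 1, 1)
    docs = []
    for i in range(n_dates):
        observed = start + timedelta(days=i)
        base = rng.randint(0, 100)
        for rev in range(rng.randint(1, max_revisions)):
            docs.append({
                "id": "1",
                "date": observed.isoformat(),
                "version_date": (observed + timedelta(days=rev)).isoformat(),
                "value": base + rev,
            })
    return docs


def latest(docs):
    """Latest value per date, found by a full scan of the documents."""
    best = {}
    for doc in docs:
        current = best.get(doc["date"])
        if current is None or doc["version_date"] > current["version_date"]:
            best[doc["date"]] = doc
    return {d: doc["value"] for d, doc in best.items()}


docs = simulate_docs()
seconds = timeit.timeit(lambda: latest(docs), number=100)
print(f"{len(docs)} documents, 100 runs of the lookup: {seconds:.4f}s")
```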
Keep in mind that with options B and C you have to be aware of the 16 MB limit per document. If you have a lot of versions you might reach that limit, although it would take a very large number of versions to get to 16 MB. Also keep in mind that updating such growing documents can end up causing many document moves on disk.
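For a rough sense of how many versions fit under that limit, the size of one embedded version entry can be estimated by hand from the BSON layout rules. The sketch below assumes dates are stored as strings, as in the question, and values as 32-bit integers (a value stored as a double would add 4 bytes per entry). The `fixed_overhead` and `array_index_digits` parameters are rough guesses for illustration, not measured numbers.

```python
MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024  # MongoDB's per-document size limit


def cstring_size(name):
    """A BSON field name: its UTF-8 bytes plus a trailing NUL byte."""
    return len(name.encode("utf-8")) + 1


def string_element_size(name, text):
    """Type byte + field name + int32 length prefix + bytes + trailing NUL."""
    return 1 + cstring_size(name) + 4 + len(text.encode("utf-8")) + 1


def int32_element_size(name):
    """Type byte + field name + 4-byte integer."""
    return 1 + cstring_size(name) + 4


def version_entry_size(version_date="2013-01-02"):
    """One {"version_date": ..., "value": ...} subdocument:
    int32 total length + its two elements + trailing NUL."""
    return (4
            + string_element_size("version_date", version_date)
            + int32_element_size("value")
            + 1)


def max_versions(array_index_digits=6, fixed_overhead=200):
    """Estimate how many entries fit in one document. Each array slot costs a
    type byte, the index stored as a string key, and the entry itself."""
    per_slot = 1 + (array_index_digits + 1) + version_entry_size()
    return (MAX_BSON_DOCUMENT_BYTES - fixed_overhead) // per_slot


print(version_entry_size())  # 45 bytes under the assumptions above
print(max_versions())
```

Under these assumptions a single date would need several hundred thousand revisions before hitting the limit, which matches the point that the limit is unlikely to be reached in practice.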
Options B and C would be nice if you needed to select all revisions of a particular document at once, but I did not find this among your most frequent queries. Keep in mind that with the right indexes you can achieve this with option A as well.
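To illustrate that last point, here is an in-memory sketch of collecting all revisions per date from flat option A documents. The sort on (date, version_date) stands in for the ordering a compound index on those two fields would provide. The function name is hypothetical and no MongoDB calls are involved.

```python
from itertools import groupby
from operator import itemgetter

# The flat documents from the question, deliberately out of order.
DOCS = [
    {"id": "1", "date": "2013-02-01", "version_date": "2013-02-02", "value": 21},
    {"id": "1", "date": "2013-01-01", "version_date": "2013-01-01", "value": 10},
    {"id": "1", "date": "2013-02-01", "version_date": "2013-02-01", "value": 20},
    {"id": "1", "date": "2013-01-01", "version_date": "2013-01-02", "value": 11},
]


def revisions_by_date(docs):
    """Group flat documents into {date: [(version_date, value), ...]},
    with each date's revisions in version_date order."""
    ordered = sorted(docs, key=itemgetter("date", "version_date"))
    return {
        day: [(d["version_date"], d["value"]) for d in group]
        for day, group in groupby(ordered, key=itemgetter("date"))
    }


for day, revisions in revisions_by_date(DOCS).items():
    print(day, revisions)
```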
Upvotes: 1
Reputation: 5139
Considering the options mentioned above and your requirements, it would be best to structure your documents by date, as you described in Option B. It would also help if your date field were indexed. Some scenarios (easy reads, updates) that show why this seems to be the properly optimized solution are:
Upvotes: 0