Reputation: 1857
I'm using MongoDB to store time series data using a similar structure to "The Document-Oriented Design" explained here: http://blog.mongodb.org/post/65517193370/schema-design-for-time-series-data-in-mongodb
The objective is to query for the top 10 busiest minutes of the day on the whole system. Each document stores 1 hour of data using 60 sub-documents (1 for each minute). Each minute stores various metrics embedded in the "vals" field. The metric I care about is "orders". A sample document looks like this:
{
"_id" : ObjectId("54d023802b1815b6ef7162a4"),
"user" : "testUser",
"hour" : ISODate("2015-01-09T13:00:00Z"),
"vals" : {
"0" : {
"orders" : 11,
"anotherMetric": 15
},
"1" : {
"orders" : 12,
"anotherMetric": 20
},
.
.
.
}
}
Note there are many users in the system.
I've managed to flatten the structure (somewhat) by doing an aggregate with the following group object:
group = {
$group: {
_id: {
hour: "$hour"
},
0: {$sum: "$vals.0.orders"},
1: {$sum: "$vals.1.orders"},
2: {$sum: "$vals.2.orders"},
.
.
.
}
}
But that just gives me 24 documents (1 for each hour) with the # of orders for each minute during that hour, like so:
{
"_id" : {
"hour" : ISODate("2015-01-20T14:00:00Z")
},
"0" : 282086,
"1" : 239358,
"2" : 289188,
.
.
.
}
Now I need to somehow get the top 10 minutes of the day from this but I'm not sure how. I suspect it can be done with $project, but I'm not sure how.
Upvotes: 5
Views: 6950
Reputation: 19700
You could aggregate as:
$match
the documents for the specific date.$group
and $project
objects before querying.$group
by the $hour
, accumulate all the documents per hour per
minute in an array.Keep the minute somewhere within the document.$project
a variable docs as $setUnion
of all the documents per
hour.$unwind
the documents.$sort
by orders
$limit
the top 10
documents which is what we require.Code:
var inputDate = new ISODate("2015-01-09T13:00:00Z");
var group = {};
var set = [];
for(var i=0;i<=60;i++){
group[i] = {$push:{"doc":"$vals."+i,
"hour":"$_id.hour",
"min":{$literal:i}}};
set.push("$"+i);
}
group["_id"] = {$hour:"$hour"};
var project = {"docs":{$setUnion:set}}
db.t.aggregate([
{$match:{"hour":{$lte:inputDate,$gte:inputDate}}},
{$group:group},
{$project:project},
{$unwind:"$docs"},
{$sort:{"docs.doc.orders":-1}},
{$limit:2},
{$project:{"_id":0,
"hour":"$_id",
"doc":"$docs.doc",
"min":"$docs.min"}}
])
Upvotes: 6