Reputation: 3433
How flexible is the aggregate function for output formatting in MongoDB?
Data format:
{
"_id" : ObjectId("506ddd1900a47d802702a904"),
"port_name" : "CL1-A",
"metric" : "772.0",
"port_number" : "0",
"datetime" : ISODate("2012-10-03T14:03:00Z"),
"array_serial" : "12345"
}
Right now I'm using this aggregate function to return an array of DateTime, an array of metrics, and a count:
{$match : { 'array_serial' : array,
'port_name' : { $in : ports},
'datetime' : { $gte : from, $lte : to}
}
},
{$project : { port_name : 1, metric : 1, datetime: 1}},
{$group : { _id : "$port_name",
datetime : { $push : "$datetime"},
metric : { $push : "$metric"},
count : { $sum : 1}}}
Which is nice, and very fast, but is there a way to format the output so there's one array per datetime/metric? Like this:
[
{
"_id" : "portname",
"data" : [
["2012-10-01T00:00:00.000Z", 1421.01],
["2012-10-01T00:01:00.000Z", 1361.01],
["2012-10-01T00:02:00.000Z", 1221.01]
]
}
]
This would greatly simplify the front-end as that's the format the chart code expects.
Upvotes: 15
Views: 21188
Reputation: 981
The following isn't conditional, but it is easier to understand:
{"_id":"$city","doc":{"$push":"$$ROOT"}}
Upvotes: 0
Reputation: 50406
MongoDB 2.6 made this a lot easier by introducing $map, which allows a simpler form of array transposition:
db.metrics.aggregate([
{ "$match": {
"array_serial": array,
"port_name": { "$in": ports},
"datetime": { "$gte": from, "$lte": to }
}},
{ "$group": {
"_id": "$port_name",
"data": {
"$push": {
"$map": {
"input": [0,1],
"as": "index",
"in": {
"$cond": [
{ "$eq": [ "$$index", 0 ] },
"$datetime",
"$metric"
]
}
}
}
},
"count": { "$sum": 1 }
}}
])
Much like the approach with $unwind, you supply a two-element array as "input" to the map operation, then replace each element with the field value you want via the $cond operation.
This removes the pipeline juggling that previous releases required to transform the document. What remains is the actual aggregation, which is accumulating per "port_name" value; the transformation to an array is no longer a problem area.
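To make the index trick easier to follow, here is a rough plain JavaScript analogue of what the $map/$cond expression and the $group stage do. It is a sketch for illustration only, not MongoDB code; toPair and groupByPort are hypothetical names, and the documents are assumed to have the port_name, datetime and metric fields from the question.

```javascript
// Plain JavaScript analogue of the $map/$cond expression (illustration only).
// toPair and groupByPort are hypothetical names, not MongoDB functions.
function toPair(doc) {
  // Like $map over [0, 1]: index 0 yields datetime, index 1 yields metric.
  return [0, 1].map(function (index) {
    return index === 0 ? doc.datetime : doc.metric;
  });
}

function groupByPort(docs) {
  var groups = {};
  var order = []; // remember the first-seen order of port names
  docs.forEach(function (doc) {
    if (!groups.hasOwnProperty(doc.port_name)) {
      groups[doc.port_name] = { _id: doc.port_name, data: [], count: 0 };
      order.push(doc.port_name);
    }
    groups[doc.port_name].data.push(toPair(doc)); // like $push
    groups[doc.port_name].count += 1;             // like $sum: 1
  });
  return order.map(function (name) { return groups[name]; });
}
```

Each input document contributes exactly one [datetime, metric] pair, which is why no extra $unwind or second $group is needed in the pipeline above.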
Upvotes: 2
Reputation: 65323
Combining two fields into an array of values with the Aggregation Framework is possible, but definitely isn't as straightforward as it could be (at least as at MongoDB 2.2.0).
Here is an example:
db.metrics.aggregate(
// Find matching documents first (can take advantage of index)
{ $match : {
'array_serial' : array,
'port_name' : { $in : ports},
'datetime' : { $gte : from, $lte : to}
}},
// Project desired fields and add an extra $index for # of array elements
{ $project: {
port_name: 1,
datetime: 1,
metric: 1,
index: { $const:[0,1] }
}},
// Split into document stream based on $index
{ $unwind: '$index' },
// Re-group data using conditional to create array [$datetime, $metric]
{ $group: {
_id: { id: '$_id', port_name: '$port_name' },
data: {
$push: { $cond:[ {$eq:['$index', 0]}, '$datetime', '$metric'] }
},
}},
// Sort results
{ $sort: { _id:1 } },
// Final group by port_name with data array and count
{ $group: {
_id: '$_id.port_name',
data: { $push: '$data' },
count: { $sum: 1 }
}}
)
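The stages are easier to follow when traced by hand. The sketch below mimics them in plain JavaScript on in-memory objects; it is an illustration only, not MongoDB code, and unwindIndex, pairById and groupPairsByPort are hypothetical names. It assumes the two unwound copies of each document reach the first group in index order, which is what produces [datetime, metric] rather than the reverse.

```javascript
// Plain JavaScript trace of the pipeline stages (illustration only).
// unwindIndex, pairById and groupPairsByPort are hypothetical names.

// $project + $unwind: every document becomes two, one per index value.
function unwindIndex(docs) {
  var out = [];
  docs.forEach(function (doc) {
    [0, 1].forEach(function (index) {
      out.push({ _id: doc._id, port_name: doc.port_name,
                 datetime: doc.datetime, metric: doc.metric, index: index });
    });
  });
  return out;
}

// First $group: per original document, push datetime for index 0, metric for index 1.
function pairById(unwound) {
  var byId = {};
  var order = [];
  unwound.forEach(function (doc) {
    var key = String(doc._id);
    if (!byId.hasOwnProperty(key)) {
      byId[key] = { _id: { id: doc._id, port_name: doc.port_name }, data: [] };
      order.push(key);
    }
    byId[key].data.push(doc.index === 0 ? doc.datetime : doc.metric);
  });
  return order.map(function (key) { return byId[key]; });
}

// Final $group: collect the pairs per port_name and count them.
function groupPairsByPort(pairs) {
  var byPort = {};
  var order = [];
  pairs.forEach(function (pair) {
    var name = pair._id.port_name;
    if (!byPort.hasOwnProperty(name)) {
      byPort[name] = { _id: name, data: [], count: 0 };
      order.push(name);
    }
    byPort[name].data.push(pair.data);
    byPort[name].count += 1;
  });
  return order.map(function (name) { return byPort[name]; });
}
```

Calling groupPairsByPort(pairById(unwindIndex(docs))) mirrors the order of the stages in the pipeline, with the $match and $sort steps left out for brevity.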
Upvotes: 17
Reputation: 33155
Building arrays in the aggregation framework without $push and $addToSet is something that seems to be lacking. I've tried to get this to work before, and failed. It would be awesome if you could just do:
data : {$push: ["$datetime", "$metric"]}
in the $group, but that doesn't work.
Also, building "literal" objects like this doesn't work:
data : {$push: {literal: ["$datetime", "$metric"]}}
or even data : {$push: {literal: "$datetime"}}
I hope they eventually come up with some better ways of massaging this sort of data.
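Until then, one fallback is to keep the question's original pipeline and pair the two arrays up on the client. A minimal sketch, assuming each result document has the parallel datetime and metric arrays built by the two $push accumulators; zipSeries is a hypothetical helper name, and the parseFloat call is an assumption based on the sample document storing metric as a string while the desired output shows numbers.

```javascript
// Client-side reshaping of one result document from the question's pipeline.
// zipSeries is a hypothetical helper name, not part of any MongoDB driver.
function zipSeries(group) {
  return {
    _id: group._id,
    // Both arrays were pushed once per source document, so positions line up.
    data: group.datetime.map(function (dt, i) {
      // Assumption: the chart wants numbers, so the string metric is parsed.
      return [dt, parseFloat(group.metric[i])];
    })
  };
}
```

Applied to the result array with results.map(zipSeries), this yields one [datetime, metric] pair per data point for each port.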
Upvotes: 1