Reputation: 1269
The problem I am facing is that how to access the original document after one group operation and also carry the fields after $group in aggregation pipeline in MongoDB.
For example: [ group, unwind, group]
Original document is:
{
"_id" : ObjectId("361de42f1938e89b179dda42"),
"user_id" : ObjectId("9424021bafbde55512e39b83"),
"candidate_id" : ObjectId("54f65356294160421ead3ca1")
"OVERALL_SCORE" : 150,
"SCORES" : [
{ "NAME" : "asd", "OBTAINED_SCORE" : 28}, { "NAME" : "acd", "OBTAINED_SCORE" : 36 }, { "NAME" : "abc", "OBTAINED_SCORE" : 40}
]
}
Aggregation function:
db.coll.aggregate([ $group : { _id : { user_id : "$user_id"}, BEST_SCORE : { $max : "$OVERALL_SCORE"}, AVG_SCORE : { $avg : "$OVERALL_SCORE" }}} ])
Below is the sample output (after 1st group):
{
"result" : [
{
"_id" : {
"user_id" : ObjectId("9424021bafbde55512e39b83")
},
"BEST_SCORE" : 150,
"AVG_SCORE" : 132
}
],
"ok" : 1
}
Question is: (I don't know whether it can be achievable) I want the fields in the original documents (input to aggregation).
For ex: 1) unwind "SCORES" which is in the original document and next group by "candidate_id", "user_id".
2) I want "BEST_SCORE", "AVG_SCORE" (after 1st group) fields to access in the 2nd group also.
Aggregation function should look something like this:
db.coll.aggregate([ $group : { _id : { user_id : "$user_id"}, BEST_SCORE : { $max : "$OVERALL_SCORE"}, AVG_SCORE : { $avg : "$OVERALL_SCORE" }}}, { $unwind : "$SCORES"}, /*problem is--after group operation "SCORES" field which is in original document not available */ { $group : _id : { NAME: "$SCORES.NAME"}, AVG_OBTAINED_SCORE: { $avg : "$SCORES.OBTAINED_SCORE"}} **/*problem is--this is also in the original document*/** ])
Output should look like this:
"BEST_SCORE": 150, //after 1st group
"AVG_SCORE": 132, //after 1st group
"SCORES": [ //problem --- unwind "SCORES" and then group which is actually will not be available after 1st group (get this from original document)
{
"NAME": "abc",
"AVG_OBTAINED_SCORE": 25.5
},
{
"NAME": "asd",
"AVG_OBTAINED_SCORE": 24
},
{
"NAME": "acd",
"AVG_OBTAINED_SCORE": 32
}
]
Could any one please help me out.
Thank you
Upvotes: 2
Views: 4200
Reputation: 151132
When grouping with something that you want to retain the values for all considered documents in a group you need to use $push
. Catch is, that this is an array. So you process $unwind
twice, and also have two $group
stages:
db.coll.aggregate([
{ "$group" : {
"_id": "$user_id",
"BEST_SCORE": { "$max": "$OVERALL_SCORE" },
"AVG_SCORE": { "$avg": "$OVERALL_SCORE" },
"SCORES": { "$push": "SCORES" }
}},
// SCORES in an array of arrays. Unwind twice
{ "$unwind": "$SCORES" },
{ "$unwind": "$SCORES" },
// Group for averages on elements
{ "$group": {
"_id": {
"user_id": "$_id",
"NAME": "$SCORES.name"
},
"BEST_SCORE": { "$first": "$BEST_SCORE" },
"AVG_SCORE": { "$first": "$AVG_SCORE" }
"AVG_OBTAINED_SCORE": { "$avg": "$SCORES.OBTAINED_SCORE" }
}},
// Group to user_id
{ "$group": {
"user_id": "$_id.user_id",
"BEST_SCORE": { "$first": "$BEST_SCORE" },
"AVG_SCORE": { "$first": "$AVG_SCORE" }
"SCORES": { "$push": {
"NAME": "$_id.NAME",
"AVG_OBTAINED_SCORE": "$AVG_OBTAINED_SCORE"
}}
}}
])
You might be tempted to consider using $unwind
before the first $group
, but if you did then the calculated averages would be affected by the number of elements present in the array that was being "un-wound". So the "double $unwind" is a necessary process here.
Upvotes: 4