Reputation: 597
I have Mongo documents which have array number values in order (it's by day) and I want to sum the same values across multiple documents for each position grouped by field outside of the array.
{"_id" : "1",
"group" : "A",
"value_list" : [1,2,3,4,5,6,7]
},
{"_id" : "2",
"group" : "B",
"value_list" : [10,20,30,40,50,60,70]
},
{"_id" : "3",
"group" : "A",
"value_list" : [1,2,3,4,5,6,7]
},
{"_id" : "4",
"group" : "B",
"value_list" : [10,20,30,40,50,60,70]
}
So the results I'm after is listed below.
There are two group A documents above and at position 1 of the value_list array, both documents have the value of 1. so 1+1=2. Position 2 the value is 2 in both documents so 2+2=4, etc.
There are two group B documents above and at position 1 of the value_list array, both documents have the value of 10. so 10+10=20. Position 2 the value is 20 in both documents so 20+20=40, etc.
{"_id" : "30",
"group" : "A",
"value_list" : [2,4,6,8,10,12,14]
},
{"_id" : "30",
"group" : "A",
"value_list" : [20,40,60,80,100,120,140]
}
How would I do this using Mongo Script? Thanks, Matt
Upvotes: 0
Views: 210
Reputation: 151092
Certainly the most "scalable" way is to use the includeArrayIndex
option of $unwind
in order to track the positions and then $sum
the "unwound" combinations, before adding back into array format:
db.getCollection('test').aggregate([
{ "$unwind": { "path": "$value_list", "includeArrayIndex": "index" } },
{ "$group": {
"_id": {
"group": "$group",
"index": "$index"
},
"value_list": { "$sum": "$value_list" }
}},
{ "$sort": { "_id": 1 } },
{ "$group": {
"_id": "$_id.group",
"value_list": { "$push": "$value_list" }
}},
{ "$sort": { "_id": 1 } }
])
Note you need to $sort
after the first $group
in order to maintain the array positions.
If you can get away with it, you could also apply all arrays into $reduce
:
db.getCollection('test').aggregate([
{ "$group": {
"_id": "$group",
"value_list": { "$push": "$value_list" }
}},
{ "$addFields": {
"value_list": {
"$reduce": {
"input": "$value_list",
"initialValue": [],
"in": {
"$map": {
"input": {
"$zip": {
"inputs": ["$$this", "$$value"],
"useLongestLength": true,
}
},
"in": { "$sum": "$$this"}
}
}
}
}
}},
{ "$sort": { "_id": 1 } }
])
Essentially you create an "array of arrays" using the initial $push
, which you process with $reduce
. The $zip
does a "pairwise" assignment per element, which are then added together at each position during $map
using $sum
.
While a bit more efficient, it's not really practical for large data as you would probably break the BSON limit by adding all grouped "arrays" into a single array on the grouping, before you "reduce" it.
Either method produces the same result:
/* 1 */
{
"_id" : "A",
"value_list" : [
2.0,
4.0,
6.0,
8.0,
10.0,
12.0,
14.0
]
}
/* 2 */
{
"_id" : "B",
"value_list" : [
20.0,
40.0,
60.0,
80.0,
100.0,
120.0,
140.0
]
}
Upvotes: 1