Matt Lightbourn
Matt Lightbourn

Reputation: 597

Sum array number values across multiple documents

I have Mongo documents which have array number values in order (it's by day) and I want to sum the same values across multiple documents for each position grouped by field outside of the array.

{"_id" : "1",
 "group" : "A",
 "value_list" : [1,2,3,4,5,6,7]
},
{"_id" : "2",
 "group" : "B",
 "value_list" : [10,20,30,40,50,60,70]
},
{"_id" : "3",
 "group" : "A",
 "value_list" : [1,2,3,4,5,6,7]
},
{"_id" : "4",
 "group" : "B",
 "value_list" : [10,20,30,40,50,60,70]
}

So the results I'm after is listed below.

There are two group A documents above and at position 1 of the value_list array, both documents have the value of 1. so 1+1=2. Position 2 the value is 2 in both documents so 2+2=4, etc.

There are two group B documents above and at position 1 of the value_list array, both documents have the value of 10. so 10+10=20. Position 2 the value is 20 in both documents so 20+20=40, etc.

{"_id" : "30",
 "group" : "A",
 "value_list" : [2,4,6,8,10,12,14]
},
{"_id" : "30",
 "group" : "A",
 "value_list" : [20,40,60,80,100,120,140]
}

How would I do this using Mongo Script? Thanks, Matt

Upvotes: 0

Views: 210

Answers (1)

Neil Lunn
Neil Lunn

Reputation: 151092

Certainly the most "scalable" way is to use the includeArrayIndex option of $unwind in order to track the positions and then $sum the "unwound" combinations, before adding back into array format:

db.getCollection('test').aggregate([
  { "$unwind": { "path": "$value_list", "includeArrayIndex": "index" } },
  { "$group": {
    "_id": {
      "group": "$group",
      "index": "$index"
    },
    "value_list": { "$sum": "$value_list" }
  }},
  { "$sort": { "_id": 1 } },
  { "$group": {
      "_id": "$_id.group",
      "value_list": { "$push": "$value_list" }
  }},
  { "$sort": { "_id": 1 } }  
])

Note you need to $sort after the first $group in order to maintain the array positions.

If you can get away with it, you could also apply all arrays into $reduce:

db.getCollection('test').aggregate([
  { "$group": {
    "_id": "$group",
    "value_list": { "$push": "$value_list" }  
  }},
  { "$addFields": {
    "value_list": {
      "$reduce": {
        "input": "$value_list",
        "initialValue": [],
        "in": {
          "$map": {
            "input": {
              "$zip": {
                "inputs": ["$$this", "$$value"],
                "useLongestLength": true,
              }
            },
            "in": { "$sum": "$$this"}
          }
        }         
      } 
    }  
  }},
  { "$sort": { "_id": 1 } }
])

Essentially you create an "array of arrays" using the initial $push, which you process with $reduce. The $zip does a "pairwise" assignment per element, which are then added together at each position during $map using $sum.

While a bit more efficient, it's not really practical for large data as you would probably break the BSON limit by adding all grouped "arrays" into a single array on the grouping, before you "reduce" it.

Either method produces the same result:

/* 1 */
{
    "_id" : "A",
    "value_list" : [ 
        2.0, 
        4.0, 
        6.0, 
        8.0, 
        10.0, 
        12.0, 
        14.0
    ]
}

/* 2 */
{
    "_id" : "B",
    "value_list" : [ 
        20.0, 
        40.0, 
        60.0, 
        80.0, 
        100.0, 
        120.0, 
        140.0
    ]
}

Upvotes: 1

Related Questions