Dan Ancona

Reputation: 162

How to sum inside records via mongodb aggregate functions

Given some data like this:

[
  {issue: "taxes", resp: "agree"},
  {issue: "taxes", resp: "disagree"},
  {issue: "taxes", resp: "disagree"},
  {issue: "sexism", resp: "disagree"},
  {issue: "sexism", resp: "disagree"},
]

I'd like to use MongoDB's aggregation functions to get output exactly like this:

{issue: "taxes", agree: 1, disagree: 2, tot: 3}
{issue: "sexism", agree: 0, disagree: 2, tot: 2}

I've tried this:

db.responses.aggregate([
  {
    $group: {
      _id: { issue: "$issue", resp: "$resp" },
      count: { $sum: 1 }
    }
  }
])

Which gets me close, but not quite:

"result" : [ 
    {
        "_id" : {
            "issue" : "racism",
            "resp" : "agree"
        },
        "count" : 3
    }, 
    {
        "_id" : {
            "issue" : "racism",
            "resp" : "disagree"
        },
        "count" : 3
    },

I could post-process the result with a loop and some ugly counters, and for the amount of data I have that would be no big deal. But I'm curious and trying to get better at MongoDB aggregation. I think I need a $project stage in the pipeline, but I cannot figure it out!
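For reference, the loop-and-counters fallback could look roughly like the sketch below. It assumes the rows have the shape returned by the $group stage above; `pivotCounts` and `grouped` are hypothetical names, and the input rows are built by hand from the sample data rather than taken from a real query.

```javascript
// Fold grouped rows ({_id: {issue, resp}, count}) into one summary per issue.
// `pivotCounts` is a hypothetical helper, not part of any MongoDB driver.
function pivotCounts(rows) {
  const byIssue = new Map();
  for (const row of rows) {
    const { issue, resp } = row._id;
    if (!byIssue.has(issue)) {
      // Start both counters at 0 so a missing "agree" group still shows up.
      byIssue.set(issue, { issue, agree: 0, disagree: 0, tot: 0 });
    }
    const summary = byIssue.get(issue);
    if (resp === "agree" || resp === "disagree") {
      summary[resp] += row.count;
    }
    summary.tot += row.count;
  }
  return Array.from(byIssue.values());
}

// Hand-built rows matching the sample data from the question.
const grouped = [
  { _id: { issue: "taxes", resp: "agree" }, count: 1 },
  { _id: { issue: "taxes", resp: "disagree" }, count: 2 },
  { _id: { issue: "sexism", resp: "disagree" }, count: 2 }
];

console.log(pivotCounts(grouped));
```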

I am also curious about how well this scales, up to roughly 20k to 50k response records, on a typical hosted MongoDB configuration. It wouldn't be hard for me to create a summary document and just update the agree/disagree totals as user input comes in. That summary document would be my desired output plus a candidate/etc. field for lookup. Maybe that's a more Mongo-ish way of doing things? Any thoughts appreciated.
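The summary-document idea could be sketched as an upsert with $inc, one call per incoming response. This is only an illustration under assumptions: `buildTallyUpdate` and the `issue_totals` collection are hypothetical names, and `updateOne` is assumed to be available (older shells would use `update` with the same arguments).

```javascript
// Build the arguments for an upsert that bumps the running totals for one response.
// `buildTallyUpdate` is a hypothetical helper name.
function buildTallyUpdate(issue, resp) {
  if (resp !== "agree" && resp !== "disagree") {
    throw new Error("unexpected resp: " + resp);
  }
  // $inc by 0 creates a missing counter as 0, so both fields always exist.
  const inc = { agree: 0, disagree: 0, tot: 1 };
  inc[resp] = 1;
  return {
    filter: { issue: issue },
    update: { $inc: inc },
    options: { upsert: true }
  };
}

const tally = buildTallyUpdate("taxes", "disagree");
console.log(JSON.stringify(tally));

// In the mongo shell this would be applied roughly as:
// db.issue_totals.updateOne(tally.filter, tally.update, tally.options)
```

The trade-off is one extra write per response in exchange for reading the totals with a plain find instead of an aggregation.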

Upvotes: 1

Views: 163

Answers (1)

Blakes Seven

Reputation: 50406

Tricky, but very doable. Instead of grouping on both fields, group on the issue alone and compute a "conditional sum" with $cond for each response count:

{ "$group": {
    "_id": "$issue",
    "agree": { "$sum": { "$cond": [ { "$eq": [ "$resp", "agree" ] }, 1, 0 ] } },
    "disagree": { "$sum": { "$cond": [ { "$eq": [ "$resp", "disagree" ] }, 1, 0 ] } },
    "total": { "$sum": 1 }
}}

Each $cond tests the value of "resp" in the current document and adds 1 to that count only when it matches; otherwise it adds 0.
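To get the field names asked for in the question (issue and tot rather than _id and total), a $project stage can follow the $group. The following is a sketch of a full pipeline; `issueSummaryPipeline` is a hypothetical variable name, and the `responses` collection name is taken from the question.

```javascript
// Conditional sums per issue, then rename fields to the requested shape.
const issueSummaryPipeline = [
  { $group: {
      _id: "$issue",
      agree: { $sum: { $cond: [ { $eq: [ "$resp", "agree" ] }, 1, 0 ] } },
      disagree: { $sum: { $cond: [ { $eq: [ "$resp", "disagree" ] }, 1, 0 ] } },
      tot: { $sum: 1 }
  } },
  // Drop _id and expose the group key under the name "issue".
  { $project: { _id: 0, issue: "$_id", agree: 1, disagree: 1, tot: 1 } }
];

// In the mongo shell: db.responses.aggregate(issueSummaryPipeline)
console.log(JSON.stringify(issueSummaryPipeline, null, 2));
```

The order of fields within each output document may differ from the question's example.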

Note that you could trim this down by storing a true/false value in the document instead of the string, so the field itself can serve as the $cond condition without the $eq test.
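A sketch of that variant, assuming each response document stores a boolean such as {issue: "taxes", agree: false}. The field name `agree` and the variable `booleanGroupStage` are hypothetical choices for illustration.

```javascript
// Assumes documents like {issue: "taxes", agree: true} (hypothetical schema).
const booleanGroupStage = {
  $group: {
    _id: "$issue",
    // The boolean field is used directly as the $cond condition.
    agree: { $sum: { $cond: [ "$agree", 1, 0 ] } },
    disagree: { $sum: { $cond: [ "$agree", 0, 1 ] } },
    total: { $sum: 1 }
  }
};

// In the mongo shell: db.responses.aggregate([ booleanGroupStage ])
console.log(JSON.stringify(booleanGroupStage, null, 2));
```

One caveat with this shape: a document with no `agree` field at all evaluates as false in $cond and would be counted under disagree.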

Upvotes: 3
