kaoscify
kaoscify

Reputation: 1763

Aggregate and Count Items in a Result from MongoDB using pymongo & Flask

I am trying to return a simple JSON result from MongoDB using pymongo in a Flask environment. Here's what I am doing:

def myResults():
    myCollection = db["my_data"]
    results = list(myCollection.find({},{"ID":1,"Response":1,"_id":0}))
    return jsonify(results=results)

When I do this, I get the following result. FYI "ID" is basically a made-up unique identifier.

{
  "result": [
    {
      "ID": 1, 
      "Response": "A"
    }, 
    {
      "ID": 4, 
      "Response": "B"
    }, 
    {
      "ID": 3, 
      "Response": "A"
    },
  ]
} // and so on...

I want to aggregate all the responses for a particular ID and show the count. Something that would look like this I guess (or if there's a better way):

{
  "result": [
    {
      "ID": 1, 
      "A": 2,
      "B": 1,
      "C": 5,
      "Total": 8
    }, 
    {
      "ID": 4, 
      "A": 0,
      "B": 5,
      "C": 18,
      "Total": 23
    }, 
    {
      "ID": 3, 
      "A": 12,
      "B": 6,
      "C": 8,
      "Total": 26
    },
  ]
}

Someone on StackOverflow recommend using aggregate(). But it's not really working for me.

allResults = surveyCollection.aggregate([
    {"$unwind": result }, 
    {"$group": {"_id": result.ID, "A": {"$sum": 1}, "B": {"$sum": 1}}},
    {"$group": {"_id": None, "result": {"$push": {"ID": "$ID", "A": "$A", "B": "$B"}}}}
])

return jsonify(allResults)

I get an error: AttributeError: 'list' object has no attribute 'ID'.

Is there no simpler method by using just find() or count()?

Upvotes: 1

Views: 3798

Answers (2)

Blakes Seven
Blakes Seven

Reputation: 50426

The aggregation framework refers to all field names with a $ prefix. This is not "python" code, but "aggregation pipeline" operations, such it's just a data structure to be translated into BSON.

allResults = surveyCollection.aggregate([
    { "$group": {
        "_id": {
            "ID": "$ID",
            "type": "$Response"
        },
        "count": { "$sum": 1 },
    }},
    { "$group": {
        "_id": "$_id.ID",
        "A": { "$sum": { "$cond": [{ "$eq": [ "$_id.type", "A" ]}, "$count", 0 ] } },
        "B": { "$sum": { "$cond": [{ "$eq": [ "$_id.type", "B" ]}, "$count", 0 ] } },
        "C": { "$sum": { "$cond": [{ "$eq": [ "$_id.type", "C" ]}, "$count", 0 ] } },
        "Total": { "$sum": "$count" }
    }}
])

return jsonify(allResults)

This produces on the three documents you gave us, this result:

{ "_id" : 1, "A" : 1, "B" : 0, "C" : 0, "Total" : 1 }
{ "_id" : 4, "A" : 0, "B" : 1, "C" : 0, "Total" : 1 }
{ "_id" : 3, "A" : 1, "B" : 0, "C" : 0, "Total" : 1 }

So you started to do the right thing on the end, but got a bit off track in the middle.

What you cannot do here is "dynanically" create fields in the aggregation framework, so you need to specify all of the required "properties" and test with the conditions as shown. If you cannot live with that, then you need mapReduce for this format. Personally I would do this which is simpler:

allResults = surveyCollection.aggregate([
    { "$group": {
        "_id": {
            "ID": "$ID",
            "type": "$Response"
        },
        "count": { "$sum": 1 }
    }},
    { "$group": {
        "_id": "$_id.ID",
        "results": { 
          "$push": {
            "type": "$_id.type", 
            "count": "$count"
          }
        },
        "Total": { "$sum": "$count" }
    }}
])

return jsonify(allResults)

Different output format, but the same basic information and simple:

{ "_id" : 1, "results" : [ { "type" : "A", "count" : 1 } ], "Total" : 1 }
{ "_id" : 4, "results" : [ { "type" : "B", "count" : 1 } ], "Total" : 1 }
{ "_id" : 3, "results" : [ { "type" : "A", "count" : 1 } ], "Total" : 1 }

BTW. This actually looks like it came from an aggregation result, so it might be time to revisit the code that created the source.

Upvotes: 2

Justin Giorgi
Justin Giorgi

Reputation: 11

Replace result.ID with "result.ID". Without the quotes python is going to look for the variable result.ID (which doesn't exist) instead of sending the server the literal string "result.ID".

Upvotes: 0

Related Questions