okoboko

Reputation: 4482

Group By Condition in MongoDB

I have a series of documents (check events) in MongoDB that look like this:

{
    "_id" : ObjectId("5397a78ab87523acb46f56"),
    "inspector_id" : ObjectId("5397997a02b8751dc5a5e8b1"),
    "status" : 'defect',
    "utc_timestamp" : ISODate("2014-06-11T00:49:14.109Z")
}

{
    "_id" : ObjectId("5397a78ab87523acb46f57"),
    "inspector_id" : ObjectId("5397997a02b8751dc5a5e8b2"),
    "status" : 'ok',
    "utc_timestamp" : ISODate("2014-06-11T00:49:14.109Z")
}

I need to get a result set that looks like this:

[
  {
    "date" : "2014-06-11",
    "defect_rate" : '.92'
  },
  {
    "date" : "2014-06-11",
    "defect_rate" : '.84'
  }
]

In other words, I need to get the average defect rate per day. Is this possible?

Upvotes: 20

Views: 51348

Answers (2)

headliner090

Reputation: 119

Would this be the best approach for the updated question?

  $group: {
    _id: "$symbol",
    amount: { $sum: "$amount" },
    value: { $sum: "$value" },
    occurrences: { $sum: 1 },
    in: {
      $sum: {
        $cond: [{ $eq: ["$direction", "IN"] }, 1, 0],
      },
    },
    out: {
      $sum: {
        $cond: [{ $eq: ["$direction", "OUT"] }, 1, 0],
      },
    },
  }

Upvotes: 4

Neil Lunn

Reputation: 151072

The aggregation framework is what you want:

db.collection.aggregate([
    { "$group": {
        "_id": {
            "year": { "$year": "$utc_timestamp" },
            "month": { "$month": "$utc_timestamp" },
            "day": { "$dayOfMonth": "$utc_timestamp" },
        },
        "defects": {
            "$sum": { "$cond": [
                { "$eq": [ "$status", "defect" ] },
                1,
                0
            ]}
        },
        "totalCount": { "$sum": 1 }
    }},
    { "$project": {
        "defect_rate": {
            "$cond": [
                { "$eq": [ "$defects", 0 ] },
                0,
                { "$divide": [ "$defects", "$totalCount" ] }
            ]
        }
    }}
])

So first you group by day using the date aggregation operators ($year, $month and $dayOfMonth) and count every document for that day as totalCount. Inside the $sum for "defects", the $cond operator tests whether "status" equals "defect" and yields 1 or 0, so the result is a conditional sum in which only the "defect" documents are counted.
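To make the conditional count concrete, here is a minimal plain JavaScript sketch of what the grouping step does, run over an in-memory array instead of a collection. The sample events and the groupByDay helper are hypothetical and only mirror the document shape from the question; this illustrates the logic and is not how MongoDB implements it.

```javascript
// Hypothetical sample events mirroring the question's document shape.
const events = [
  { status: "defect", utc_timestamp: new Date("2014-06-11T00:49:14.109Z") },
  { status: "ok",     utc_timestamp: new Date("2014-06-11T08:12:00.000Z") },
  { status: "defect", utc_timestamp: new Date("2014-06-12T10:00:00.000Z") },
];

// Mimics the $group stage: one bucket per UTC day, holding a conditional
// count of defects and an unconditional count of all events.
function groupByDay(docs) {
  const buckets = new Map();
  for (const doc of docs) {
    // toISOString() is always UTC, so the first 10 chars are "YYYY-MM-DD".
    const day = doc.utc_timestamp.toISOString().slice(0, 10);
    const bucket = buckets.get(day) || { defects: 0, totalCount: 0 };
    bucket.defects += doc.status === "defect" ? 1 : 0; // like $cond inside $sum
    bucket.totalCount += 1;                            // like { $sum: 1 }
    buckets.set(day, bucket);
  }
  return buckets;
}

const grouped = groupByDay(events);
```

With these sample events, the bucket for 2014-06-11 holds one defect out of two events, and the bucket for 2014-06-12 holds one defect out of one.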

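Once those are grouped per day you simply $divide the defects by the totalCount. The extra $cond returns 0 directly for a day with no defects; since every group contains at least one document, totalCount is never zero and the division is always safe.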
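The question asked for a single "date" string per day, whereas the pipeline above leaves the year, month and day as separate parts inside _id. A possible variant, assuming MongoDB 3.0 or later (where $dateToString is available), groups directly on a formatted date string and renames it in $project. The pipeline variable name and the final $sort stage are additions for illustration, not part of the original answer.

```javascript
// Sketch of a variant that emits a "date" string per day.
// Assumes MongoDB 3.0+ for $dateToString; field names follow the question.
const pipeline = [
  { $group: {
      // Group key is the UTC day formatted as "YYYY-MM-DD".
      _id: { $dateToString: { format: "%Y-%m-%d", date: "$utc_timestamp" } },
      // Conditional sum: add 1 only for documents whose status is "defect".
      defects: { $sum: { $cond: [ { $eq: ["$status", "defect"] }, 1, 0 ] } },
      totalCount: { $sum: 1 }
  }},
  { $project: {
      _id: 0,
      date: "$_id",
      // totalCount is at least 1 for every group, so no zero check is needed.
      defect_rate: { $divide: ["$defects", "$totalCount"] }
  }},
  // Added for readability of the output: oldest day first.
  { $sort: { date: 1 } }
];

// In the mongo shell this would be run as: db.collection.aggregate(pipeline)
```

Note that defect_rate comes back as a number here rather than a string such as '.92'; any formatting would have to happen in the client.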

Upvotes: 38
