dzm
dzm

Reputation: 23554

MongoDB aggregate include pseudo data

I have a simple aggregate that just returns the a sum of users by status between a certain time frame.

User.aggregate([
  {
    $match: {
      "created" : {
          $gt: startDate,
          $lt: endDate
       } 
    }
  },
  {
    $group: {
      "_id": "$status",
      "count" : {
        $sum: 1
      }
    }
  }
])

What I would like to do is display data for every day within the date range, even if there is no data.

So for example, the result may end up something like this:

[{
 '_id' : '01-15-2015',
  status_counts: {
    'active': 15,
    'inactive': 25,
    'removed': 2
  }
},
{
 '_id' : '01-16-2015',
  status_counts: {
    'active': 0,
    'inactive': 0,
    'removed': 0
  }
},
{
 '_id' : '01-17-2015',
  status_counts: {
    'active': 25,
    'inactive': 5,
    'removed': 1
  }
}]

Any ideas how I could go about doing this? Essentially summing all statuses and grouped by day, but if no data is there, include default data zeroed out?

Example Data:

[{
    "_id" : ObjectId("55413bc29d41675785bf7ed2"),
    "status" : "active",
    "created" : ISODate("2015-10-11T17:25:46.843Z")
},
{
    "_id" : ObjectId("55413bc29d41675785bf7ed2"),
    "status" : "inactive",
    "created" : ISODate("2015-10-12T17:25:46.843Z")
},
{
    "_id" : ObjectId("55413bc29d41675785bf7ed2"),
    "status" : "removed",
    "created" : ISODate("2015-10-12T17:25:46.843Z")
},
{
    "_id" : ObjectId("55413bc29d41675785bf7ed2"),
    "status" : "active",
    "created" : ISODate("2015-10-14T17:25:46.843Z")
},
{
    "_id" : ObjectId("55413bc29d41675785bf7ed2"),
    "status" : "active",
    "created" : ISODate("2015-10-14T17:25:46.843Z")
},
{
    "_id" : ObjectId("55413bc29d41675785bf7ed2"),
    "status" : "active",
    "created" : ISODate("2015-10-17T17:25:46.843Z")
}]

Example Result:

[{
    "_id":"10-11-2015",
    "status_counts": {
        "active":1,
        "inactive":0,
        "removed":0
    }
},
{
    "_id":"10-12-2015",
    "status_counts": {
        "active":0,
        "inactive":1,
        "removed":1
    }
},
{
    "_id":"10-13-2015",
    "status_counts": {
        "active":0,
        "inactive":0,
        "removed":0
    }
},
{
    "_id":"10-14-2015",
    "status_counts": {
        "active":2,
        "inactive":0,
        "removed":0
    }
},
{
    "_id":"10-15-2015",
    "status_counts": {
        "active":0,
        "inactive":0,
        "removed":0
    }
},
{
    "_id":"10-16-2015",
    "status_counts": {
        "active":0,
        "inactive":0,
        "removed":0
    }
},
{
    "_id":"10-17-2015",
    "status_counts": {
        "active":1,
        "inactive":0,
        "removed":0
    }
}]

Upvotes: 1

Views: 41

Answers (1)

chridam
chridam

Reputation: 103435

Use the following pipeline where pipeline where the $project stage creates a document with a new property that is composed only by the month-day-year by using the date aggregation operators and the string operators $concat, $subtr operators to do the string manipulation. You can then use the new property in the preceding $group pipeline step as the group key and obtain the respective status counts by using the $cond operator to evaluate the status type and assign the value to $sum. The closing $project pipeline stage then reshapes the final documents to provide the needed fields within a subdocument:

User.aggregate([
    {
        "$match": {
            "created": { "$gt": startDate, "$lt": endDate } 
        }
    },
    {
        "$project": {
           "_id": 0, "status": 1,
           "dayPart": {
                "$concat" : [ 
                    {
                        "$substr": [ {"$month" : "$created"}, 0, 2 ]
                    }, "-",
                    {
                        "$substr": [ {"$dayOfMonth" : "$created"}, 0, 2 ]
                    }, "-",
                    { 
                        "$substr": [ {"$year" : "$created"}, 0, 4 ]
                    }
                ] 
           }
        }
    },
    { 
        "$group": { 
            "_id": "$dayPart",             
            "active_count": {
                "$sum": {
                    "$cond": [ { "$eq": [ "$status", "active" ] }, 1, 0 ]
                }
            },
            "inactive_count": {
                "$sum": {
                    "$cond": [ { "$eq": [ "$status", "inactive" ] }, 1, 0 ]
                }
            },
            "removed_count": {
                "$sum": {
                    "$cond": [ { "$eq": [ "$status", "removed" ] }, 1, 0 ]
                }
            }            
        }  
    },
    {
        "$project": {            
            "status_counts": {
                "active": "$active_count",
                "inactive": "$inactive_count",
                "removed": "$removed_count"
            }
        }
    }
])

Upvotes: 1

Related Questions