armnotstrong

Reputation: 9065

Aggregate mongodb records with self-defined time period

I have a collection that contains events reported by clients, such as:

{ "_id" : ObjectId("54f43159c922ac0b4387ef9c"), "appversion" : "v1.2", "appid" : "930370913", "clkip" : "", "actip" : "", "clktime" : 1425289561, "acttime" : 0, "platform" : "google", "isnotified" : false, "idfa" : "14A900D9-A61A-41DC-A327-96EBE4BA57B31" }
{ "_id" : ObjectId("54f43159c922ac0b4387ef9d"), "appversion" : "v1.2", "appid" : "930370913", "clkip" : "", "actip" : "", "clktime" : 1425289561, "acttime" : 0, "platform" : "google", "isnotified" : false, "idfa" : "14A900D9-A61A-41DC-A327-96EBE4BA57B32" }
{ "_id" : ObjectId("54f43159c922ac0b4387ef9e"), "appversion" : "v1.2", "appid" : "930370913", "clkip" : "", "actip" : "", "clktime" : 1425289561, "acttime" : 0, "platform" : "facebook", "isnotified" : false, "idfa" : "14A900D9-A61A-41DC-A327-96EBE4BA57B33" }
{ "_id" : ObjectId("54f43159c922ac0b4387ef9f"), "appversion" : "v1.2", "appid" : "930370913", "clkip" : "", "actip" : "", "clktime" : 1425289561, "acttime" : 0, "platform" : "google", "isnotified" : false, "idfa" : "14A900D9-A61A-41DC-A327-96EBE4BA57B34" }

You can see that clktime is a Unix timestamp (self-defined, not one generated by MongoDB) with a precision of one second. I want to know how many events each platform has reported per 5 minutes (by clktime). I know I should use MongoDB's aggregation framework, something like:

 db.event.aggregate([{$match:{clktime:{$gt:1425204775}}},{$group:{_id:???, count:{$sum:1}}}])
                                                                      ^^^
                                                                       I really don't know what this _id should be.

But I don't know how to define the _id of $group :-(

The output I want to achieve is like this:

{ "_id" : 0, "time":1425207775, "count" : 100 }
{ "_id" : 0, "time":1425210775, "count" : 51 }
{ "_id" : 0, "time":1425213775, "count" : 51 }

It would be nicer if the platform could be identified too. But if that's too complex, you can just point me to some references and I will dig into it myself.

Any suggestions will be appreciated.

Upvotes: 2

Views: 110

Answers (1)

Neil Lunn

Reputation: 151172

Not a problem really, and not too hard. You just need some arithmetic to round each timestamp down to the start of the 5 minute interval it falls in, since clktime is a number and not a date value. The same thing is possible with Date objects (which you should really be using, because there is little to no overhead and not much difference in processing), but let's stick to the point:

db.event.aggregate([
    { "$match": { "clktime":{ "$gt": 1425204775 } } },
    { "$group": {
        "_id": {
            "$subtract": [
                "$clktime",
                { "$mod": [ "$clktime", 60 * 5 ] }   // 5 minutes in seconds
            ]
        },
        "count": { "$sum": 1 }
    }}
])

Subtracting the remainder rounds each clktime down to the start of its 5 minute interval, and that rounded value is the grouping key you want in _id.
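To make the rounding concrete, here is a small sketch of the same arithmetic in plain JavaScript, outside of MongoDB. The names BUCKET_SECONDS and bucketStart are made up for illustration; the timestamp is the clktime from the sample documents in the question.

```javascript
// Bucket width: 5 minutes expressed in seconds.
const BUCKET_SECONDS = 60 * 5;

// Round a Unix timestamp (in seconds) down to the start of its bucket.
// This mirrors { $subtract: [ "$clktime", { $mod: [ "$clktime", 300 ] } ] }.
function bucketStart(clktime, width = BUCKET_SECONDS) {
    return clktime - (clktime % width);
}

const sample = 1425289561;          // clktime from the sample documents
const start = bucketStart(sample);  // 1425289561 - 61
console.log(start);                 // 1425289500
```

Every clktime from 1425289500 up to and including 1425289799 maps to the same key, so those documents end up in one group.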

Also, the _id value is the grouping key, so your expected result is not valid: every row there has the same _id of 0, whereas _id can only be the value (or combination of values) that each group is uniquely keyed on. This is really no different from SQL "GROUP BY", if you are familiar with that.
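Since the question also asks about platform, one way to get it is to make _id a compound key. The following is a sketch rather than something tested against your data: it assumes the field names from the sample documents, and the $project and $sort stages are optional additions that only reshape the output into "time" and "count" fields like those in the question. The names FIVE_MINUTES and pipeline are just for illustration.

```javascript
const FIVE_MINUTES = 60 * 5; // bucket width in seconds

const pipeline = [
    { "$match": { "clktime": { "$gt": 1425204775 } } },
    { "$group": {
        // Compound grouping key: interval start plus platform.
        "_id": {
            "time": {
                "$subtract": [
                    "$clktime",
                    { "$mod": [ "$clktime", FIVE_MINUTES ] }
                ]
            },
            "platform": "$platform"
        },
        "count": { "$sum": 1 }
    }},
    // Optional: flatten the compound key into top-level fields.
    { "$project": {
        "_id": 0,
        "time": "$_id.time",
        "platform": "$_id.platform",
        "count": 1
    }},
    { "$sort": { "time": 1, "platform": 1 } }
];

// In the mongo shell: db.event.aggregate(pipeline)
```

Each output document then represents one platform within one 5 minute interval, which is the same idea as grouping on two columns in SQL.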

Upvotes: 3
