whitecat999

Reputation: 3

mongodb: Aggregate hourly data to bi-hourly aggregates

I have an hourly report in MongoDB that holds some data for each hour. I want to derive a bi-hourly report from it, meaning each result should contain the sums of the fields "count" and "value" over every two hours. How do I do the aggregation? Thanks a lot!

Before, hourly data:

/* 1 */
{
    "count" : 63713,
    "value" : 46151,
    "timestamp" : ISODate("2014-09-17T18:59:04.247+03:00"),
}

/* 2 */
{
    "count" : 63743,
    "value" : 48327,
    "timestamp" : ISODate("2014-09-17T19:59:04.281+03:00"),
}

/* 3 */
{
    "count" : 63761,
    "value" : 51650,
    "timestamp" : ISODate("2014-09-17T20:59:04.295+03:00"),
}

/* 4 */
{
    "count" : 63756,
    "value" : 52865,
    "timestamp" : ISODate("2014-09-17T21:59:04.298+03:00"),
}

After, bi-hourly data:

/* sum of documents 1&2 */

{
    "count" : 127456,
    "value" : 94478,
    "timestamp" : ISODate("2014-09-17T18:59:04.247+03:00"),
}

/* sum of documents 3&4 */

{
    "count" : 127517,
    "value" : 104515,
    "timestamp" : ISODate("2014-09-17T20:59:04.295+03:00"),
}

Upvotes: 0

Views: 1720

Answers (1)

Neil Lunn

Reputation: 151132

Actually, your "bi-hourly" data would cover three time periods for the sample as given, because the dates are stored and grouped in UTC rather than in the +03:00 offset shown. Document 1 falls in the first two-hour block, documents 2 and 3 in the second, and document 4 in the third.
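To see where the three blocks come from, here is a minimal plain-JavaScript sketch (not MongoDB code) that floors each sample timestamp to the start of its two-hour UTC block. The names TWO_HOURS_MS, timestamps and blocks are made up for illustration, and ISODate is replaced by the standard Date constructor so the snippet runs outside the shell.

```javascript
// Two hours expressed in milliseconds.
const TWO_HOURS_MS = 1000 * 60 * 60 * 2;

// Timestamps of documents 1 to 4 from the question.
const timestamps = [
    new Date("2014-09-17T18:59:04.247+03:00"),
    new Date("2014-09-17T19:59:04.281+03:00"),
    new Date("2014-09-17T20:59:04.295+03:00"),
    new Date("2014-09-17T21:59:04.298+03:00")
];

// Floor each timestamp to the start of its two-hour block, counted from the epoch (UTC).
const blocks = timestamps.map(function (ts) {
    const ms = ts.getTime();
    return new Date(ms - (ms % TWO_HOURS_MS)).toISOString();
});

console.log(blocks);
```

Documents 2 and 3 end up with the same block start (16:00 UTC), while documents 1 and 4 each land in a block of their own.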

So you can just apply some date math here to get 12 two-hour intervals within a day:

db.times.aggregate([
    { "$group": {
        "_id": {
            "$subtract": [
                { "$subtract": [ "$timestamp", new Date("1970-01-01") ] },
                { "$mod": [
                    { "$subtract": [ "$timestamp", new Date("1970-01-01") ] },
                    1000 * 60 * 60 * 2
                ]}
            ],
        },
        "count": { "$sum": "$count" },
        "value": { "$sum": "$value" }
    }},
    { "$sort": { "_id": 1 } }
])

This produces a numeric _id in epoch milliseconds that marks the start of each two-hour interval. Alternatively, you could use the date aggregation operators instead:

db.times.aggregate([
    { "$group": {
        "_id": {
            "day": { "$dayOfYear": "$timestamp" },
            "hour": {
                "$subtract": [
                    { "$hour": "$timestamp" },
                    { "$mod": [ { "$hour": "$timestamp" }, 2 ] }
                ]
            }
        },
        "count": { "$sum": "$count" },
        "value": { "$sum": "$value" }
    }},
    { "$sort": { "_id": 1 } }
])
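As a rough way to check the arithmetic outside the database, the sketch below repeats the same "value minus (value mod two hours)" grouping in plain JavaScript on the four sample documents. It is only an illustration, not how the server evaluates the pipeline. The helper groupByTwoHours and its offsetMs parameter are hypothetical: the offset is an assumed way to align the blocks with the +03:00 local clock, which neither pipeline above does.

```javascript
// Two hours expressed in milliseconds.
const TWO_HOURS = 1000 * 60 * 60 * 2;

// Sample documents from the question, with ISODate swapped for Date.
const docs = [
    { count: 63713, value: 46151, timestamp: new Date("2014-09-17T18:59:04.247+03:00") },
    { count: 63743, value: 48327, timestamp: new Date("2014-09-17T19:59:04.281+03:00") },
    { count: 63761, value: 51650, timestamp: new Date("2014-09-17T20:59:04.295+03:00") },
    { count: 63756, value: 52865, timestamp: new Date("2014-09-17T21:59:04.298+03:00") }
];

// Hypothetical helper: offsetMs shifts the clock before flooring (0 means UTC blocks).
function groupByTwoHours(documents, offsetMs) {
    const totals = new Map();
    for (const doc of documents) {
        const shifted = doc.timestamp.getTime() + offsetMs;
        // Floor to the two-hour boundary, then undo the shift to get a real instant.
        const bucket = shifted - (shifted % TWO_HOURS) - offsetMs;
        const sum = totals.get(bucket) || { count: 0, value: 0 };
        sum.count += doc.count;
        sum.value += doc.value;
        totals.set(bucket, sum);
    }
    // Sort by block start and turn the numeric key back into a Date.
    return Array.from(totals.entries())
        .sort(function (a, b) { return a[0] - b[0]; })
        .map(function (entry) {
            return { start: new Date(entry[0]), count: entry[1].count, value: entry[1].value };
        });
}

const utcBlocks = groupByTwoHours(docs, 0);
const localBlocks = groupByTwoHours(docs, 3 * 60 * 60 * 1000);

console.log(utcBlocks);
console.log(localBlocks);
```

With an offset of 0 the sample yields three blocks, matching the grouping described above. With the assumed three-hour offset it yields two blocks that pair documents 1 & 2 and 3 & 4, which is the pairing the question asks for.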

Upvotes: 1
