bastianwegge

Reputation: 2498

How can one find subsequent results using MongoDB aggregation pipeline?

Imagine the following:

{
    "user" : "john",
    "type" : "connect",
    "created" : ISODate("2015-10-02T10:00:00.000Z"),
    "__v" : 0
},
{
    "user" : "john",
    "type" : "disconnect",
    "created" : ISODate("2015-10-02T10:10:00.000Z"),
    "__v" : 0
},
{
    "user" : "frank",
    "type" : "connect",
    "created" : ISODate("2015-10-02T10:05:00.000Z"),
    "__v" : 0
},
{
    "user" : "frank",
    "type" : "disconnect",
    "created" : ISODate("2015-10-02T10:15:00.000Z"),
    "__v" : 0
},
{
    "user" : "john",
    "type" : "connect",
    "created" : ISODate("2015-10-02T10:15:00.000Z"),
    "__v" : 0
}

I want to create an outage report stating, for example, "user john had 5 minutes of downtime". I'm completely in the dark right now. I dug into aggregation as well as mapReduce, but nothing seems to point in the direction I need. I could solve it using plain JavaScript, but I want to avoid that since MongoDB is made for aggregations of this kind. Maybe I'm just stuck in my head and need to let it rest for a while, but maybe someone has a good solution for me.

So the best output would be (I guess):

{
    "user" : "john",
    "outage": "5"
}

Besides this "optimal" example, there may be cases where disconnects are swallowed by the system when deploying a new server version. Those are about 10 seconds and I'm thinking about leaving them out for the sake of ease.

Upvotes: 1

Views: 116

Answers (1)

Rohit Jain

Reputation: 2092

Query

db.collection.aggregate([
    { $group: { _id: { user: "$user", type: "$type" }, created: { $max: "$created" } } },
    { $sort: { "_id.user": 1, "_id.type": -1 } },
    { $group: { _id: "$_id.user", latest_disconnected: { $first: "$created" }, latest_connected: { $last: "$created" } } },
    { $project: { outage: { $divide: [ { $subtract: [ "$latest_connected", "$latest_disconnected" ] }, 60000 ] } } }
])

Output

{ "_id" : "john", "outage" : 5 }

{ "_id" : "frank", "outage" : -10 }

Here is the detailed explanation.

This gives you the latest entry of each type (connect and disconnect) for each user:

{$group:{_id:{user:"$user",type:"$type"},created:{$max:"$created"}}}

Don't forget to sort by user and type; $group does not guarantee sorted output:

{$sort:{"_id.user":1,"_id.type":-1}}

Each user has two entries from the previous stage, so after sorting the first entry is the latest disconnect and the last (second) entry is the latest connect:

{$group:{_id:"$_id.user","latest_disconnected":{"$first":"$created"},"latest_connected":{$last:"$created"}}}
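To see why the descending sort on type puts the disconnect first, here is a minimal plain-JavaScript sketch. It does not talk to MongoDB; the `grouped` array is a hand-written stand-in for what the first $group stage would emit for john, and the comparator only mimics the descending string ordering of {$sort: {"_id.type": -1}}.

```javascript
// Hypothetical output of the first $group stage for user "john"
// (values taken from the sample documents in the question).
const grouped = [
  { _id: { user: "john", type: "connect" }, created: new Date("2015-10-02T10:15:00.000Z") },
  { _id: { user: "john", type: "disconnect" }, created: new Date("2015-10-02T10:10:00.000Z") },
];

// Mimic a descending sort on type: "disconnect" compares greater than
// "connect" as a string, so it comes first.
const sorted = [...grouped].sort((a, b) => {
  if (a._id.type < b._id.type) return 1;
  if (a._id.type > b._id.type) return -1;
  return 0;
});

// What $first and $last would pick from the sorted entries.
const latestDisconnected = sorted[0].created;
const latestConnected = sorted[sorted.length - 1].created;

console.log(sorted.map((doc) => doc._id.type));
```

Because the ordering relies on the literal strings "connect" and "disconnect", other type names would need a different sort or an explicit condition.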

This is simple arithmetic: the difference between the two dates (in milliseconds), converted to minutes:

{$project:{outage:{$divide:[{"$subtract": [ "$latest_connected", "$latest_disconnected" ] },60000]}}}
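As a sanity check, the same logic can be sketched in plain JavaScript against the sample documents from the question. This is only an illustration, not a replacement for the pipeline: `latestOutageMinutes` is a hypothetical helper name, and it assumes every user has at least one connect and one disconnect event.

```javascript
// Sample events copied from the question (the __v field is omitted).
const events = [
  { user: "john", type: "connect", created: new Date("2015-10-02T10:00:00.000Z") },
  { user: "john", type: "disconnect", created: new Date("2015-10-02T10:10:00.000Z") },
  { user: "frank", type: "connect", created: new Date("2015-10-02T10:05:00.000Z") },
  { user: "frank", type: "disconnect", created: new Date("2015-10-02T10:15:00.000Z") },
  { user: "john", type: "connect", created: new Date("2015-10-02T10:15:00.000Z") },
];

// Hypothetical helper: a plain-JavaScript stand-in for the pipeline.
// Assumes each user has at least one event of each type.
function latestOutageMinutes(docs) {
  const latest = new Map(); // user -> { connect: Date, disconnect: Date }
  for (const { user, type, created } of docs) {
    const entry = latest.get(user) || {};
    // Keep the most recent timestamp per type, like $max in the first $group.
    if (!entry[type] || created > entry[type]) entry[type] = created;
    latest.set(user, entry);
  }
  // Subtracting two Date objects yields milliseconds; 60000 ms = 1 minute.
  return [...latest].map(([user, { connect, disconnect }]) => ({
    _id: user,
    outage: (connect - disconnect) / 60000,
  }));
}

const report = latestOutageMinutes(events);
console.log(report);
```

A negative value, as for frank, means the user's most recent event is a disconnect, so there is no later connect to close the outage. Note also that only the latest connect/disconnect pair per user is considered, not the sum of all outages.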

Hope this helps.

Upvotes: 1
