Olaf Erlandsen
Olaf Erlandsen

Reputation: 6036

Calculate Average Duration from a Series of Dates

Well, I have one collection with too much documents(millions), and now i wanna know how I can calculate average of session duration.

This is a example of structure of my docs:

{
    "session" : "xyz",
    "date" : {"$date" : ""Tue May 15 2018 10:35:08 GMT-0400 LT""}
}
{
    "session" : "xyz",
    "date" : {"$date" : ""Tue May 15 2018 12:35:08 GMT-0400 LT""}
}
{
    "session" : "xyz",
    "date" : {"$date" : ""Tue May 15 2018 15:35:08 GMT-0400 LT""}
}
{
    "session" : "abc",
    "date" : {"$date" : ""Tue May 15 2018 09:35:08 GMT-0400 LT""}
}
{
    "session" : "abc",
    "date" : {"$date" : ""Tue May 15 2018 11:35:08 GMT-0400 LT""}
}

And I wanna define a algorithm like this:

  1. Get all documents by "$session"
  2. Calculate min(first) date and max(last) date of these(2nd point) documents.
  3. Calculate duration(time difference) by session
  4. Repat first 1st point with another "$session"
  5. after calculate all average by "$session", calculate average of these total.

I think first need using aggregate with $group, but I dont have experience with that and now I have a existencial problem hahaha

Thanks

Upvotes: 0

Views: 534

Answers (1)

Neil Lunn
Neil Lunn

Reputation: 151190

As long as "sequence" is not important here then $min and $max are the right things:

db.collection.aggregate([
  { "$group": {
    "_id": "$session",
    "minDate": { "$min": "$date" },
    "maxDate": { "$max": "$date" }
  }},
  { "$group": {
    "_id": null,
    "avg": {
      "$avg": { "$subtract": [ "$maxDate", "$minDate" ] }
    }
  }}
])

The initial $group obtains the $min and $max values from the "grouping key" provided which is the "session" value in the document. The the second $group is done without any grouping key ( or null ) in order to accumulate "everything".

When you $subtract one BSON Date from the other, then the result is the difference in milliseconds. This is also a handing way of casting to a numeric value in general. But all you do is $subtract the two values output from $min and $max and feed that to the $avg accumulator.

Upvotes: 1

Related Questions