Reputation: 6036
Well, I have one collection with too much documents(millions), and now i wanna know how I can calculate average of session duration.
This is a example of structure of my docs:
{
"session" : "xyz",
"date" : {"$date" : ""Tue May 15 2018 10:35:08 GMT-0400 LT""}
}
{
"session" : "xyz",
"date" : {"$date" : ""Tue May 15 2018 12:35:08 GMT-0400 LT""}
}
{
"session" : "xyz",
"date" : {"$date" : ""Tue May 15 2018 15:35:08 GMT-0400 LT""}
}
{
"session" : "abc",
"date" : {"$date" : ""Tue May 15 2018 09:35:08 GMT-0400 LT""}
}
{
"session" : "abc",
"date" : {"$date" : ""Tue May 15 2018 11:35:08 GMT-0400 LT""}
}
And I wanna define a algorithm like this:
I think first need using aggregate
with $group, but I dont have experience with that and now I have a existencial problem hahaha
Thanks
Upvotes: 0
Views: 534
Reputation: 151190
As long as "sequence" is not important here then $min
and $max
are the right things:
db.collection.aggregate([
{ "$group": {
"_id": "$session",
"minDate": { "$min": "$date" },
"maxDate": { "$max": "$date" }
}},
{ "$group": {
"_id": null,
"avg": {
"$avg": { "$subtract": [ "$maxDate", "$minDate" ] }
}
}}
])
The initial $group
obtains the $min
and $max
values from the "grouping key" provided which is the "session"
value in the document. The the second $group
is done without any grouping key ( or null
) in order to accumulate "everything".
When you $subtract
one BSON Date from the other, then the result is the difference in milliseconds. This is also a handing way of casting to a numeric value in general. But all you do is $subtract
the two values output from $min
and $max
and feed that to the $avg
accumulator.
Upvotes: 1