TomK

Reputation: 363

Need max value in a MongoDB collection, and its timestamp

Consider a collection consisting of documents of the following form:

{
"_id" : ObjectId("55f3600da9fb6e4f937a50a7"),
"timestamp" : ISODate("2010-01-01T08:10:00Z"),
"temperature" : 12.31
}
{
"_id" : ObjectId("55f3600da9fb6e4f937a50a8"),
"timestamp" : ISODate("2010-01-01T08:15:00Z"),
"temperature" : 12.48
}
...

I want to find the max temperature for each day of the year, and the time it occurred. The first part is easy enough with an aggregation pipeline:

[{"$group" : {"_id" : {"day": { "$dayOfYear": "$timestamp" }},
              "max_temperature": {"$max" : "$temperature"}}},
 {"$sort" : {"_id.day":1}}]

This gives me a nice result set with the max temperature for each day:

{u'max_temperature': 20.98, u'_id': {u'day': 1}}
{u'max_temperature': 24.15, u'_id': {u'day': 2}}
{u'max_temperature': 22.02, u'_id': {u'day': 3}}
...

But, how do I get the timestamp of when each daily maximum occurred? Something like:

{u'max_temperature': 20.98, u'time_of_max': ISODate("2010-01-01T15:11:12"), u'_id': {u'day': 1}}
{u'max_temperature': 24.15, u'time_of_max': ISODate("2010-01-02T16:03:42"), u'_id': {u'day': 2}}
{u'max_temperature': 22.02, u'time_of_max': ISODate("2010-01-03T16:33:59"), u'_id': {u'day': 3}}
...

Upvotes: 3

Views: 2427

Answers (2)

Rohit Jain

Reputation: 2092

db.temp.aggregate([
    { $project: {
        date: { $dayOfYear: "$timestamp" },
        temp: "$temperature",
        ISODate: "$timestamp"
    }},
    { $group: {
        _id: "$date",
        maxtemp: { "$max": "$temp" },
        timestamp: { "$first": "$ISODate" }
    }},
    { "$sort": { _id: 1 } }
])

Upvotes: 0

Blakes Seven

Reputation: 50406

Use $sort first, and then the $first operator instead of $max. The catch is that the "day" value has to be projected before you can sort on it:

[
    { "$project": {
        "day": { "$dayOfYear": "$timestamp" },
        "timestamp": 1,
        "temperature": 1
    }},
    { "$sort": { "day": 1, "temperature": -1 } },
    { "$group": {
        "_id" : "$day",
        "max_temperature": { "$first": "$temperature" },
        "timestamp": { "$first": "$timestamp" }
    }},
    { "$sort": { "_id":1 } }
]

Once you have that projected value for "day" then the input can be ordered by day with the largest "temperature" value first. Then the $first grouping operator will pick fields from the "first" document found on the grouping boundary.

So the "temperature" is the "maximum" because of the sort order, and the other fields will be from the same document where that value occurs.
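To make the sort-then-$first idea concrete, here is a minimal pure-Python sketch of the same logic on an in-memory list. It does not talk to MongoDB at all; the sample readings and the names `day_of_year` and `daily_max` are made up for illustration.

```python
from datetime import datetime
from itertools import groupby

# Hypothetical sample readings, shaped like the documents in the question.
readings = [
    {"timestamp": datetime(2010, 1, 1, 8, 10), "temperature": 12.31},
    {"timestamp": datetime(2010, 1, 1, 15, 11), "temperature": 20.98},
    {"timestamp": datetime(2010, 1, 2, 9, 0), "temperature": 18.40},
    {"timestamp": datetime(2010, 1, 2, 16, 3), "temperature": 24.15},
]


def day_of_year(doc):
    # Plays the role of the projected "day" field ($dayOfYear).
    return doc["timestamp"].timetuple().tm_yday


def daily_max(docs):
    # Stand-in for {"$sort": {"day": 1, "temperature": -1}}.
    ordered = sorted(docs, key=lambda d: (day_of_year(d), -d["temperature"]))
    results = []
    # groupby walks the sorted list; the first item of each group acts
    # like $first, so both fields come from the same document.
    for day, group in groupby(ordered, key=day_of_year):
        top = next(group)
        results.append({
            "_id": day,
            "max_temperature": top["temperature"],
            "timestamp": top["timestamp"],
        })
    return results


for row in daily_max(readings):
    print(row)
```

Because the whole top document is picked per day, the temperature and the timestamp cannot come from different readings, which is the point of using $first after a sort rather than $max.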

Technically, this should still work:

[
    { "$sort": { "temperature": -1, "timestamp": 1 } },
    { "$group": {
        "_id" : { "$dayOfYear": "$timestamp" },
        "max_temperature": { "$first": "$temperature" },
        "timestamp": { "$first": "$timestamp" }
    }},
    { "$sort": { "_id":1 } }
]

But because the initial sort order is not aligned with the grouping key, it is likely to lose out in overall efficiency.
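If you are calling this from Python (the u'...' output in the question suggests pymongo), the first pipeline could be wrapped roughly like this. It is only a sketch: the helper names `daily_max_pipeline` and `fetch_daily_max` are invented here, and `collection` is assumed to be a pymongo `Collection`.

```python
def daily_max_pipeline():
    # The project / sort / group / sort pipeline from the first listing.
    # The multi-key $sort relies on key order; plain dicts keep insertion
    # order on Python 3.7+.
    return [
        {"$project": {
            "day": {"$dayOfYear": "$timestamp"},
            "timestamp": 1,
            "temperature": 1,
        }},
        {"$sort": {"day": 1, "temperature": -1}},
        {"$group": {
            "_id": "$day",
            "max_temperature": {"$first": "$temperature"},
            "timestamp": {"$first": "$timestamp"},
        }},
        {"$sort": {"_id": 1}},
    ]


def fetch_daily_max(collection):
    # `collection` is assumed to be a pymongo Collection; aggregate()
    # returns a cursor, which is drained into a list here.
    return list(collection.aggregate(daily_max_pipeline()))


# Hypothetical usage (database and collection names are assumptions):
# from pymongo import MongoClient
# rows = fetch_daily_max(MongoClient()["weather"]["temp"])
```

On an older Python where dict order is not guaranteed, the two-key $sort stage would need an ordered mapping instead of a plain dict.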

Upvotes: 2
