Hugo

Reputation: 1688

How can I improve performance on a MongoDB aggregation query?

I am using the following query to get the count of records per day where the air temperature is below 7.2 degrees. The documentation recommends using the aggregation framework, since it is faster than map-reduce.

db.maxial.aggregate([{
    $project: {
        time:1,
        temp:1,
        frio: {
            $cond: [
                { $lte: [ "$temp", 7.2 ] },
                0.25,
                0
            ]
        }
    }
}, {
    $match: {
        time: {
            $gte: new Date('11/01/2011'),
            $lt: new Date('11/03/2011')
        }
    }
}, {
    $group: {
        _id: {
            ord_date: {
                day: { $dayOfMonth: "$time" },
                month: { $month: "$time" },
                year: { $year: "$time" }
            }
        },
        horasFrio: { $sum: '$frio' }
    }
}, {
    $sort: {
        '_id.ord_date': 1
    }
}])

I get an average execution time of 2 seconds. Am I doing something wrong? I already have indexes on the time and temp fields.

Upvotes: 2

Views: 6969

Answers (2)

Abhi Das

Reputation: 518

To improve the performance of an aggregation query, use the pipeline stages in the right order. Put $match first, followed by $limit and $skip if needed. Each of these reduces the number of records that have to be traversed for grouping, which improves performance.

Upvotes: -1

Neil Lunn

Reputation: 151072

You might have indexes defined, but you are not using them. For an aggregation pipeline to "use" an index, the $match stage must come first. Also, if you omit the $project entirely and put the $cond expression directly in $group, you are doing it in the most efficient way.

db.maxial.aggregate( [
    { "$match": {
        "time": {
            "$gte": new Date('2011-11-01'),
            "$lt": new Date('2011-11-03')
        }
    }},
    { "$group": {
        "_id": {
            "day": { "$dayOfMonth": "$time" },
            "month": { "$month": "$time" },
            "year": { "$year": "$time" }
        },
        "horasFrio": {
            "$sum": {
                "$cond": [{ "$lte": [ "$temp", 7.2 ] }, 0.25, 0 ]
            }
        }
    }},
    { "$sort": { "_id": 1} }
])
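
To check whether the index is actually picked up, one option is to ask for the query plan instead of the results. This is only a sketch: it assumes the collection is called maxial as in the question and that an index exists on time. The startsWithMatch variable is just an illustrative name, and the guard on db lets the same file run outside the shell without error. In the printed plan you would typically look for an index scan (IXSCAN) rather than a collection scan (COLLSCAN).

```javascript
// Same pipeline as above, kept in a variable so it can be reused.
const pipeline = [
    { "$match": {
        "time": {
            "$gte": new Date("2011-11-01"),
            "$lt": new Date("2011-11-03")
        }
    }},
    { "$group": {
        "_id": {
            "day": { "$dayOfMonth": "$time" },
            "month": { "$month": "$time" },
            "year": { "$year": "$time" }
        },
        "horasFrio": {
            "$sum": {
                "$cond": [{ "$lte": [ "$temp", 7.2 ] }, 0.25, 0 ]
            }
        }
    }},
    { "$sort": { "_id": 1 } }
];

// Sanity check (hypothetical helper variable): the first stage is a $match.
const startsWithMatch = Object.keys(pipeline[0])[0] === "$match";

// Only attempt the explain when running inside the mongo shell,
// where the global "db" object is defined.
if (typeof db !== "undefined") {
    const plan = db.maxial.aggregate(pipeline, { explain: true });
    printjson(plan);
}
```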

$project does not provide the benefits people think it does in terms of "reducing fields" in a direct way.

And beware of JavaScript "Date" object constructors. Unless you issue the date in the right way, you will get a locally converted date rather than the UTC time reference you should be issuing. That and other misconceptions are cleared up in the rewritten listing.
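
A small illustration of the date point, runnable in plain JavaScript without a database. A date-only ISO string such as '2011-11-01' is parsed as UTC midnight, while the component form of the constructor uses the client's local time zone. Strings like '11/01/2011' are not a standard format and, in the engines I am aware of, are also treated as local time, so take that part as an assumption. The variable names here are illustrative only.

```javascript
// Date-only ISO 8601 strings are parsed as UTC midnight.
const isoStart = new Date("2011-11-01");

// Explicit UTC construction; months are zero-based, so 10 means November.
const utcStart = new Date(Date.UTC(2011, 10, 1));

// The component constructor uses the local time zone of the client.
const localStart = new Date(2011, 10, 1);

// Minutes by which local midnight is shifted from UTC midnight on that date.
// This is 0 only when the client itself runs in UTC.
const skewMinutes = (localStart.getTime() - utcStart.getTime()) / 60000;

console.log(isoStart.toISOString());
console.log(localStart.toISOString());
console.log(skewMinutes);
```

If skewMinutes is not zero on your machine, a range built from local dates will start and end that many minutes away from the UTC day boundaries stored in the collection.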

Upvotes: 4
