want_to_be_calm
want_to_be_calm

Reputation: 1627

MongoDB get document of current date

I have a document like follows, now I want to group by channel, year and month, and day of month, then I would like to select the record of current date. Here is my document :

{
    "_id" : ObjectId("56de5eab29f9463baf6eb7d2"),
    "created_at" : ISODate("2016-03-08T05:10:03.854Z"),
    "updated_at" : ISODate("2016-03-08T05:10:03.854Z"),
    "channel" : "040403",
    "countrycode" : "IN",
    "distinctid" :       "abc123456",   
    "os" : "Android"
}

I have write an aggregate like this :

db.collection.aggregate([
    {$project: {              
              "channel":"$channel",
              "year": {$year: "$created_at" },
              "month: {$month: "$created_at" },
              "day": {$dayOfMonth: "$created_at" }
         },
         {$match: {
             "year": new Date().getFullYear(),
             "month": new Date().getMonth() + 1, 
             "day": new Date().getDate()     
         },         
         {$group:
             {{"_id":{"channel":"$channel",
                       year: {$year: "$created_at" },
                       month: {$month: "$created_at" },
                       day: {$dayOfMonth: "$created_at" }},
              "count":{"$sum":1}
             }
          }
    ]
)

However, it said

missing : after property id @(shell):1:108

Is that my idea correct and why the syntax is wrong? Thanks

Upvotes: 2

Views: 4861

Answers (2)

Blakes Seven
Blakes Seven

Reputation: 50406

Get rid of the intitial $project and use a "range" on the $match instead:

  // Get current start of day and start of tomorrow
  var now = Date.now(),
      oneDay = ( 1000 * 60 * 60 * 24 ),
      today = new Date( now - ( now % oneDay ) ),
      tomorrow = new Date( today.valueOf() + oneDay );

  db.collection.aggregate([
     // Match on range
     { "$match": {
         "$created_at": {
            "$gte": today, "$lt": tomorrow
         }
     }},
     // Then group on just the rolled up date
     { "$group": {
         "_id":{
             "channel":"$channel",
             "year": { "$year": "$created_at" },
             "month": { "$month": "$created_at" },
             "day": { "$dayOfMonth": "$created_at" }
          },
          "count":{"$sum":1}
     }}
  ])

Using a "range" with $gte and $lt is the most efficient option. If you try to extract components first with $project then this is actually a "full pass" through all the data to calculate those values. Only after all that work would anything be done to actually select the documents you want.

When $match is the very first stage, then all the filtering is done already, leaving less documents to process and a faster result. More importantly and as a "query", in the initial stage this is the only oppurtunity your processing has to use an "index" to speed things up. So best to take that opportunity.

Not that that the grouping per day is really doing anything for you when selecting only one day, but it's always the technique when you want a range of dates. And yes a "single" day is still a "range".


As for what is wrong with what you tried? Well lots of things, but most notably not using $match in the very first stage.

Other things you really need to remember is that stages like $project and $group "output" an altered representation of the document from the source. So what comes "out" is all that is available to the subsequent "pipeline" stage, and so on. Think "unix pipe" | with commands, and the best analogy to understand how each pipeline stage relates to the other. i.e:

ps -ef | grep mongo | tee output.txt

Upvotes: 1

Nicolas
Nicolas

Reputation: 2231

There are two problems with your query:

  1. You miss quite a few closing } for every stage.
  2. You are grouping by fields that are of Integer type (year, month, day), but you apply aggregation operators ($year, $month, $day) that expect Date fields.

Try this query instead:

db.collection.aggregate([{
    $project: {
        "channel": "$channel",
        "year": { $year: "$created_at" },
        "month": { $month: "$created_at" },
        "day": { $dayOfMonth: "$created_at" }
    }
}, {
    $match: {
        "year": new Date().getFullYear(),
        "month": new Date().getMonth() + 1,
        "day": new Date().getDate()
    }
}, {
    $group: {
        "_id": { "channel": "$channel", year: "$year", month: "$month", day: "$day" },
        "count": { "$sum": 1 }
    }
}]);

Upvotes: 2

Related Questions