Reputation: 1627
I have a document like follows, now I want to group by channel, year and month, and day of month, then I would like to select the record of current date. Here is my document :
{
"_id" : ObjectId("56de5eab29f9463baf6eb7d2"),
"created_at" : ISODate("2016-03-08T05:10:03.854Z"),
"updated_at" : ISODate("2016-03-08T05:10:03.854Z"),
"channel" : "040403",
"countrycode" : "IN",
"distinctid" : "abc123456",
"os" : "Android"
}
I have write an aggregate like this :
db.collection.aggregate([
{$project: {
"channel":"$channel",
"year": {$year: "$created_at" },
"month: {$month: "$created_at" },
"day": {$dayOfMonth: "$created_at" }
},
{$match: {
"year": new Date().getFullYear(),
"month": new Date().getMonth() + 1,
"day": new Date().getDate()
},
{$group:
{{"_id":{"channel":"$channel",
year: {$year: "$created_at" },
month: {$month: "$created_at" },
day: {$dayOfMonth: "$created_at" }},
"count":{"$sum":1}
}
}
]
)
However, it said
missing : after property id @(shell):1:108
Is that my idea correct and why the syntax is wrong? Thanks
Upvotes: 2
Views: 4861
Reputation: 50406
Get rid of the intitial $project
and use a "range" on the $match
instead:
// Get current start of day and start of tomorrow
var now = Date.now(),
oneDay = ( 1000 * 60 * 60 * 24 ),
today = new Date( now - ( now % oneDay ) ),
tomorrow = new Date( today.valueOf() + oneDay );
db.collection.aggregate([
// Match on range
{ "$match": {
"$created_at": {
"$gte": today, "$lt": tomorrow
}
}},
// Then group on just the rolled up date
{ "$group": {
"_id":{
"channel":"$channel",
"year": { "$year": "$created_at" },
"month": { "$month": "$created_at" },
"day": { "$dayOfMonth": "$created_at" }
},
"count":{"$sum":1}
}}
])
Using a "range" with $gte
and $lt
is the most efficient option. If you try to extract components first with $project
then this is actually a "full pass" through all the data to calculate those values. Only after all that work would anything be done to actually select the documents you want.
When $match
is the very first stage, then all the filtering is done already, leaving less documents to process and a faster result. More importantly and as a "query", in the initial stage this is the only oppurtunity your processing has to use an "index" to speed things up. So best to take that opportunity.
Not that that the grouping per day is really doing anything for you when selecting only one day, but it's always the technique when you want a range of dates. And yes a "single" day is still a "range".
As for what is wrong with what you tried? Well lots of things, but most notably not using $match
in the very first stage.
Other things you really need to remember is that stages like $project
and $group
"output" an altered representation of the document from the source. So what comes "out" is all that is available to the subsequent "pipeline" stage, and so on. Think "unix pipe" |
with commands, and the best analogy to understand how each pipeline stage relates to the other. i.e:
ps -ef | grep mongo | tee output.txt
Upvotes: 1
Reputation: 2231
There are two problems with your query:
}
for every stage.Integer
type (year
, month
, day
), but you apply aggregation operators ($year
, $month
, $day
) that expect Date
fields.Try this query instead:
db.collection.aggregate([{
$project: {
"channel": "$channel",
"year": { $year: "$created_at" },
"month": { $month: "$created_at" },
"day": { $dayOfMonth: "$created_at" }
}
}, {
$match: {
"year": new Date().getFullYear(),
"month": new Date().getMonth() + 1,
"day": new Date().getDate()
}
}, {
$group: {
"_id": { "channel": "$channel", year: "$year", month: "$month", day: "$day" },
"count": { "$sum": 1 }
}
}]);
Upvotes: 2