Reputation: 2577
I am trying sort data by date first and then group on another field. It is not working to me.
The question I am trying to answer is: Select the most recent distinct cid?
given this data:
db.summary.save({"lid" : 5, "date" : 5, "cid" : 2, "circles" : [ 2 ] })
db.summary.save({"lid" : 2, "date" : 2, "cid" : 1, "circles" : [ 2 ] })
db.summary.save({"lid" : 4, "date" : 0, "cid" : 3, "circles" : [ 2 ] })
db.summary.save({"lid" : 3, "date" : 3, "cid" : 2, "circles" : [ 2 ] })
db.summary.save({"lid" : 1, "date" : 1, "cid" : 1, "circles" : [ 2 ] })
db.summary.aggregate( {$match :{circles: 2}, $sort: {date: -1}, $group: {_id: '$cid'}} )
I am doing a match first on circles, then a sort on date, then a group on cid
The result I am getting:
{
"result" : [
{
"_id" : 3
},
{
"_id" : 1
},
{
"_id" : 2
}
],
"ok" : 1
}
Here is my analysis:
before matching or sorting by date, the data was:
"lid" : 5, "date" : 5, "cid" : 2
"lid" : 2, "date" : 2, "cid" : 1
"lid" : 4, "date" : 0, "cid" : 3
"lid" : 3, "date" : 3, "cid" : 2
"lid" : 1, "date" : 1, "cid" : 1
After sorting by date, the data set would be:
"lid" : 5, "date" : 5, "cid" : 2
"lid" : 3, "date" : 3, "cid" : 2
"lid" : 2, "date" : 2, "cid" : 1
"lid" : 1, "date" : 1, "cid" : 1
"lid" : 4, "date" : 0, "cid" : 3
So after grouping, the result I expect is:
{
"result" : [
{
"_id" : 2
},
{
"_id" : 1
},
{
"_id" : 3
}
],
"ok" : 1
}
What query does solve my problem?
Why is the current query not working for me?
Upvotes: 24
Views: 13271
Reputation: 312129
When you $group
after a $sort
in the pipeline, the previous sort is lost. You'd have to do something like this instead so that the date you want to sort by is available after the grouping:
db.summary.aggregate(
{$match: {circles: 2}},
{$group: {_id: '$cid', date: {$max: '$date'}}},
{$sort: {date: -1}});
result:
[ { _id: 2, date: 5 },
{ _id: 1, date: 2 },
{ _id: 3, date: 0 } ]
Add a $project
to the end of the pipeline if you want to reshape the output.
Upvotes: 40