aokozlov
aokozlov

Reputation: 731

Mongodb aggregate distinct with sort and limit

I have a collection objects.

{
    "_id" : ObjectId("55fa65046db58e7d0c8b456a"),
    "object_id" : "1651419",
    "user" : {
        "id" : "65593",
        "cookie" : "9jgkm7ME1HDFD4K6j8WWvg",
    },
    "createddate" : ISODate("2015-09-17T10:00:20.945+03:00")
}

Every time user visits object's page it stores as separate record in collection. Now i need to get array of last N visited objects. It should be distinct, so array should have N unique records. Also, it should be sorted by createddate. So if the user visited object_id = 1, then object_id = 2 two times, after that visited object_id = 3 and again object_id = 1 the array should contain:

{
    visits : [1, 3, 2]
}

(distinct and sorted by time of last visit).

I tried to use construction like

db.objects.aggregate([
    {$match: {'user.id' : '65593'}},
    {$sort: { 'createddate':-1 }},
    {$project: {'id': '$user.id', 'obj' : '$object_id'}},
    {$group: {_id:'$id', 'obj': {$addToSet: '$obj'}}},
    {$project:{_id:0, 'obj':'$obj'}}
])

but it returns array that not sorted and also i can't limit array size.

Upvotes: 1

Views: 5359

Answers (2)

Blakes Seven
Blakes Seven

Reputation: 50406

The $addToSet operator and "sets" in general for MongoDB are not ordered in any way. Insead, get the "distinct" values by grouping on them first, then apply to the array after sorting them:

db.objects.aggregate([
    { "$match": { "user.id": "65593" } },
    { "$sort": { "user.id": 1, "createddate": -1 } },
    { "$group": {
        "_id": {
            "_id": "$user.id", 
            "object_id": "$object_id"
        },
        "createddate": { "$first": "$createddate" }
    }},
    { "$sort": { "_id._id": 1, "createddate": -1 } },
    { "$group": {
        "_id": "$_id._id",
        "obj": { "$push": "$_id.object_id" }        
    }}
])

So if you want the discovery oder by date you $sort first, but since $group does not guarantee any order of results you need to $sort again before you group with the $push operation to build the array.

Note that you are likely reducing down the "createddate" somehow as then general "distinct" items would appear to be the "user.id" and the "object_id" fields, so this does need some sort of accumulator and needs to be included for your ordering.

Then the array items will be in the order you expect.

If you need to $limit then you must process $unwind and split the limit the results. Alternately process a "limit" after the first group and following sort here.

But of course this is only practical to do for a single main grouping _id, being "user.id". Future mongodb releases will support $slice, which will make this practical for multiple grouping id's and a bit more simple in general. But it still won't be possible to "limit" the array items before that initial group over multiple primary groupind id's.

Upvotes: 5

aokozlov
aokozlov

Reputation: 731

I found the solution i expected.

db.objects.aggregate([
  {$match: {'user.id' : '65593'}},
  {$group : { 
      _id : '$object_id',
      dt : {$max: '$createddate'}
      }
  },
  {$sort: {'dt':-1}},
  {$limit:5},
  {$group : { 
      _id :null,
      'objects' : {$push:'$_id'}
      }
  },
  {$project: {_id:0, 'objects':'$objects'}}
])

It returns limited to N distinct array sorted backwards by createddate. Thank everyone for help!

Upvotes: 3

Related Questions