How to get sorted result start from a point in MongoDB?

Question

For example, I got some data in MongoDB

db: people

{_id:1, name:"Tom", age:26}
{_id:2, name:"Jim", age:22}
{_id:3, name:"Mac", age:22}
{_id:4, name:"Zoe", age:22}
{_id:5, name:"Ray", age:18}
....

If I want to get result sorted by "age", that's easy, just create a index "age" and use sort. Then I got a long list of return. I may got result like below:

{_id:5, name:"Ray", age:18}
{_id:2, name:"Jim", age:22}
{_id:3, name:"Mac", age:22}
{_id:4, name:"Zoe", age:22}
{_id:1, name:"Tom", age:26}
...

What if I only want this list also sorted by "age" and start from "Mac"? like below:

{_id:3, name:"Mac", age:22}
{_id:4, name:"Zoe", age:22}
{_id:1, name:"Tom", age:26}
...

I can't use $gte because this may include "Jim". Ages can be the same.

What the right way to query this? Thanks.

Neil Lunn · Accepted Answer

I think this is more of a "terminology" problem in that what you call a "start point" others call it something different. There are two things I see here as both what I would believe to be the "wrong" approach, and one I would think is the "right" approach to what you want to do. Both would give the desired result on this sample though. There is of course the "obvious" approach if that is simple enough for your needs as well.

For the "wrong" approach I would basically say to use $gte in both cases, for the "name" and "age". This basically gives you are "starting point" at "Mac":

db.collection.find(
    { "name": { "$gte": "Mac" }, "age": { "$gte": 22 } }
).sort({ "age": 1 })

But of course this would not work if you had "Alan" of age "27" since the name is less than the starting name value. Works on your sample though of course.

What I believe to the the "right" thing to what you are asking is that you are talking about "paging" data with a more efficient way that using .skip(). In this case what you want to do is "exclude" results in a similar way.

So this means essentially keeping the last "page" of documents seen, or possibly more depending on how much the "range" value changes, and excluding by the unique _id values. Best demonstrated as:

// First iteration
var cursor = db.collection.find({}).sort({ "age": 1 }).limit(2);

cursor.forEach(function(result) {
    seenIds.push(result._id);
    lastAge = result.age;
    // do other things
});

// Next iteration
var cursor = db.collection.find(
    { "_id": { "$nin": seenIds }, "age": { "$gte": lastAge } }
).sort({ "age": 1 }).limit(2);

Since in the first instance you had already "seen" the first two results, you submit the _id values as a $nin operation to exclude them and also ask for anything "greater than or equal to" the "last seen" age value.

That is an efficient way to "page" data in a forwards direction, and may indeed be what you are asking, but of course it requires that you know "which data" comes "before Mac" in order to do things properly. So that leaves the final "obvious" approach:

The most simple way to just start at "Mac" is to basically query the results and just "discard" anything before the results ever got to this desired value:

var startSeen = false;
db.collection.find(
    { "age": {"$gte": 22}}
).sort({ "age": 1 }).forEach(function(result) {
    if ( !startSeen )
        startSeen = ( result.name == 'Mac' );

    if ( startSeen ) {
        // Mac has been seen. Do something with your data
    }

})

At the end of the day, there is no way of course to "start from where 'Mac' appears in a a sorted list" in any arbitrary way. You are either going to :

Lexically remove any other results occurring before
Store results and page through them to "cut points" for last seen values
Just live with iterating the cursor and discarding results until the "first" desired match is found.

How to get sorted result start from a point in MongoDB?

Answers (2)

Related Questions