Nandu Kalidindi
Nandu Kalidindi

Reputation: 6280

Fetch first N and last N records from MongoDB collection

I have a use case where I need to show top 10 and last 10 results from a group and sorted aggregation. I tried to use $limit but that will not let the next aggregators to work on the complete data.

db.collection.aggregate([groupAggregator, sortAggregator, { $limit: 10 }, /*only 10 records available*/] 

How to perform aggregations on the whole collection in the middle of the pipeline? I am using MongoDB 3.2.9. If that's impossible is there a way to UNION two aggregations the first one being the top 10 (ASC SORTED) and the second one last 10 (DESC SORTED),

Had it not been for the group aggregation I would have used the db.collection.find({}).sort().filter() strategy but the group needs to be done.

Data obtained from group aggregation

{_id: "", ..., avg_count: 10}
{_id: "", ..., avg_count: 1}
{_id: "", ..., avg_count: 2}
{_id: "", ..., avg_count: 5}
{_id: "", ..., avg_count: 8}
{_id: "", ..., avg_count: 3}
{_id: "", ..., avg_count: 4}
{_id: "", ..., avg_count: 6}
{_id: "", ..., avg_count: 7}
{_id: "", ..., avg_count: 9}

Data obtained from Sort aggregation

{_id: "", ..., avg_count: 1}
{_id: "", ..., avg_count: 2}
{_id: "", ..., avg_count: 3}
{_id: "", ..., avg_count: 4}
{_id: "", ..., avg_count: 5}
{_id: "", ..., avg_count: 6}
{_id: "", ..., avg_count: 7}
{_id: "", ..., avg_count: 8}
{_id: "", ..., avg_count: 9}
{_id: "", ..., avg_count: 10}

DESIRED OUTPUT:

FETCH FIRST 2 AND LAST 2 documents

{_id: "", ..., avg_count: 1}
{_id: "", ..., avg_count: 2}
{_id: "", ..., avg_count: 9}
{_id: "", ..., avg_count: 10}

NOTE: The above is just a sample data, actual data does not have exact serial numbers.

Upvotes: 1

Views: 3877

Answers (1)

dnickless
dnickless

Reputation: 10918

If understand you correctly, here's one way of getting that behaviour:

db.collection.aggregate([{
    $sort: { "your_sort_field": 1 } // sort the data
}, {
    $group: {
        _id: null, // group everything into one single bucket
        docs: { $push: "$$ROOT" } // push all documents into an array (this will be massive for huge collections...)
    }
}, {
    $project: {
        "docsTop10": { $slice: [ "$docs", 10 ] }, // take the first 10 elements from the ASC sorted array
        "docsBottom10": { $reverseArray: { $slice: [ "$docs", -10 ] } } // take the last 10 elements from the array but reverse their order
    }
}])

If you want to have everything in one single property you can simply use $concatArrays in the final stage:

$project: {
    "result": { $concatArrays: [ { $slice: [ "$docs", 10 ] }, { $reverseArray: { $slice: [ "$docs", -10 ] } } ] }
}

Unfortunately, there's no $replaceRoot yet in your version of MongoDB, otherwise you could flatten the results more nicely.

Also, since $reverseArray does not seem to be available in v3.2, either, you can simply drop that operator and $unwind and $sort one more time after the $project stage:

{
    $project: {
        _id: 0,
        "result": { $concatArrays: [ { $slice: [ "$docs", 10 ] }, { $slice: [ "$docs", -10 ] } ] }
    }
}, {
    $unwind: "$result"
}, {
    $sort: { "result.your_sort_field": 1 } // sort the data
}

Another option would be to use $facet (only from v3.4 onwards) which would certainly be way faster as MongoDB would be able to nicely optimize the sort/limit combination:

db.collection.aggregate([{
    $facet: { // start two separate pipeline
        "docsTop10": [
            { $sort: { "your_sort_field": 1 } }, // sort ASC
            { $limit: 10 } // take top 10
        ],
        "docsBottom10": [
            { $sort: { "your_sort_field": -1 } }, // sort DESC
            { $limit: 10 } // take top 10
        ]
    }
}])

Upvotes: 5

Related Questions