Ashot
Ashot

Reputation: 255

MongoDb aggregate with limit and without limit

There is a collection in mongo In the collection of 40 million records

db.getCollection('feedposts').aggregate([

   {
        "$match": {
            "$or": [
                {
                  "isOfficial": true
                },
                {
                  "creator": ObjectId("537f267c984539401ff448d2"),
                   type: { $nin: ['challenge_answer', 'challenge_win']}

                }
            ],
        }
  },
  {
      $sort: {timeline: -1}
  }
])

This request never ends

But if you add a limit before sorting, and the limit is higher than the total number of records in advance, for example, 1,000,000,000,000,000 - the request will be processed instantly

db.getCollection('feedposts').aggregate([

   {
        "$match": {
            "$or": [
                {
                  "isOfficial": true
                },
                {
                  "creator": ObjectId("537f267c984539401ff448d2"),
                   type: { $nin: ['challenge_answer', 'challenge_win']}

                }
            ],
        }
  },
   {
      $limit: 10000000000000000
  },

  {
      $sort: {timeline: -1}
  }
])

Please tell me why this is happening? What problems can I expect in the future if I leave it this way?

Upvotes: 1

Views: 1627

Answers (1)

Tom Slabbaert
Tom Slabbaert

Reputation: 22276

TLDR: Mongo is using the wrong index for the query

Why is this happening?

Well basically every query you do Mongo simulates a quick "competition" between the relevant indexes in order to choose which one to use, the first index to retrieve 1001 documents "wins".

Now usually this situation of picking the wrong index occurs with ascending or descending fields and a matching index making this index with the fetching competition under certain conditions, Meaning this is very risky as you can have stable code that can suddenly become a huge bottleneck.

What can we do?

You have a few options:

  1. Use the hint option and make Mongo use the compound index you have ready for this pipeline.
  2. Drop the rogue index to ensure this will never happen again elsewhere (which is my recommended option).
  3. Keep doing what you're doing. basically by adding this random $limit stage you're throwing Mongo's competition off and ensuring the right index will be picked.

Upvotes: 4

Related Questions