SeungJun

Reputation: 91

Mongoose: how to use index in aggregate?

How can I use indexes in an aggregation pipeline?

I saw the documentation: https://docs.mongodb.com/manual/core/aggregation-pipeline/#pipeline-operators-and-indexes

The $match and $sort pipeline operators can take advantage of an index when they occur at the beginning of the pipeline.

Is there any way to use an index in a stage that is not at the beginning of the pipeline, like $sort, $match, or $group?

Please help me

Upvotes: 1

Views: 2231

Answers (1)

B. Fleming

Reputation: 7220

An index works by keeping a record of certain pieces of data that point to a given record in your collection. Think of it like having a novel, and then having a sheet of paper that lists the names of various people or locations in that novel with the page numbers where they're mentioned.

Aggregation is like taking that novel and transforming the different pages into an entirely different stream of information. You don't know where the new information is located until the transformation actually happens, so you can't possibly have an index on that transformed information.

In other words, an index cannot be used by an aggregation pipeline stage that is not at the very beginning, because by that point the data will already have been transformed, and MongoDB has no way of knowing whether the newly transformed data can be efficiently matched against any index.

If your aggregation pipeline is too large to handle efficiently, then you need to limit the size of your pipeline in some way such that you can handle it more efficiently. Ideally this would mean having a $match stage that sufficiently limits the documents to a reasonably-sized subset. This isn't always possible, however, so additional effort may be required.
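As a minimal in-memory sketch of why stage order matters (the collection, field names `status` and `type`, and values here are all hypothetical): a `$match` at the front of the pipeline behaves like an indexed `find`, shrinking the data set before any transformation, whereas a `$match` after a `$group` can only scan data the index never saw.

```javascript
// Illustrative only: a tiny in-memory model of "$match first, then $group".
// In MongoDB, the initial $match could use an index on "status"; every
// later stage operates on transformed documents where no index applies.
const docs = [
  { status: "done", type: "sale",   amount: 10 },
  { status: "done", type: "refund", amount: 5 },
  { status: "open", type: "sale",   amount: 7 },
];

// Stage 1 ($match): filter early -- this is the step an index can serve.
const matched = docs.filter(d => d.status === "done");

// Stage 2 ($group): sum amounts per type over the reduced subset.
const grouped = {};
for (const d of matched) {
  grouped[d.type] = (grouped[d.type] || 0) + d.amount;
}

console.log(grouped); // { sale: 10, refund: 5 }
```

The open transaction never reaches the grouping step, which is exactly the effect a well-placed leading `$match` has on a real pipeline.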

One possibility is generating "summary" documents that are the result of aggregating all new data together, then performing your primary aggregation pipeline using only these summary documents. For example, if you have a log of transactions in your system that you wish to aggregate, then you could generate a daily summary of the quantities and types of the different transactions that have been logged for the day, along with any other additional data you would need. You would then limit your aggregation pipeline to only these daily summary documents and avoid using the normal transaction documents.
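A rough sketch of the summary idea, using plain JavaScript over an in-memory log (the `date`, `type`, and `amount` fields are assumed for illustration; in practice you would run this as a scheduled aggregation writing summary documents to their own collection):

```javascript
// Hypothetical raw transaction log entries.
const transactions = [
  { date: "2019-01-01", type: "sale",   amount: 10 },
  { date: "2019-01-01", type: "sale",   amount: 20 },
  { date: "2019-01-01", type: "refund", amount: 5 },
  { date: "2019-01-02", type: "sale",   amount: 15 },
];

// Build one summary document per day: per-type totals and counts.
// Later pipelines then aggregate these few summaries instead of the full log.
function summarizeByDay(txns) {
  const byDay = new Map();
  for (const t of txns) {
    if (!byDay.has(t.date)) {
      byDay.set(t.date, { date: t.date, totals: {}, counts: {} });
    }
    const s = byDay.get(t.date);
    s.totals[t.type] = (s.totals[t.type] || 0) + t.amount;
    s.counts[t.type] = (s.counts[t.type] || 0) + 1;
  }
  return [...byDay.values()];
}

const summaries = summarizeByDay(transactions);
console.log(summaries.length); // 2 daily summary documents for 4 raw entries
```

Four raw documents collapse into two summaries; on a real system the reduction (and the corresponding pipeline speedup) would be far larger.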

An actual solution is beyond the scope of this question. Just be aware that this limitation on index usage is one you cannot avoid.

Upvotes: 2
