Guerrilla

Reputation: 14866

Tackling MongoDB single text index limit

I have a collection of around 200m documents. I need to build a search for it that looks for substrings. Using a regex search is incredibly slow even with a regular index on the field being searched for.

The answer seems to be a text index but there is only one text index allowed per collection. I can make the text index search multiple fields but that will actually break intended functionality as it will make results inaccurate. I need to specify the exact fields the substring should appear in.

Are there any ways around this limitation? The documentation says their cloud databases allow multiple indexes but for this project I need to keep data on our own servers.

Upvotes: 3

Views: 664

Answers (1)

Sihoon Kim

Reputation: 1799

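Yes, even with a regular index on the field, an unanchored regex search still has to examine every index entry (or every document), so it stays slow. Only a case-sensitive prefix regex such as /^abc/ can use the index efficiently. You can also have only one text index per collection, although that one index can cover several fields. On top of that, a text index is based on words, not substrings, so it would not help here.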

A regular index essentially keeps the values of the indexed field pre-sorted (ascending or descending). A text index is similar, but it goes a step further and indexes each word of the selected fields separately. Since you are searching for substrings rather than whole words, a text index would be just as useless.
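To make the word-versus-substring point concrete, here is a toy sketch of a word-based index in plain JavaScript. It is only an illustration, not how MongoDB implements text indexes (the real thing also applies stemming and stop-word removal), and the names `buildWordIndex` and `lookupWord` are made up for this example.

```javascript
// Toy model of a word-based index: each whole word maps to the ids of the
// documents that contain it. Hypothetical helper names, illustration only.
function buildWordIndex(docs) {
  const index = new Map();
  for (const { id, text } of docs) {
    // Split on non-word characters and drop empty tokens.
    for (const word of text.toLowerCase().split(/\W+/).filter(Boolean)) {
      if (!index.has(word)) index.set(word, new Set());
      index.get(word).add(id);
    }
  }
  return index;
}

// A lookup only succeeds when the query equals an indexed word exactly.
function lookupWord(index, term) {
  return [...(index.get(term.toLowerCase()) ?? [])];
}

const docs = [
  { id: 1, text: "Chocolate cheesecake recipe" },
  { id: 2, text: "Cake decorating basics" },
];
const wordIndex = buildWordIndex(docs);

console.log(lookupWord(wordIndex, "cake"));   // [ 2 ]  ("cheesecake" is a different word)
console.log(lookupWord(wordIndex, "cheese")); // []     (substring of a word, never indexed)
```

Document 1 contains "cake" as a substring of "cheesecake", but the index never sees it, which is why a word-based index cannot answer substring queries.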

To solve your problem, you would typically have to use a dedicated search engine such as Elasticsearch.

Fortunately, MongoDB Atlas recently released Atlas Search, and it should solve your problem. You can index multiple (or all) fields, and it can also match substrings, for example through n-gram tokenization. It's basically a search engine: just like Elasticsearch, it is built on the popular open-source search library Lucene. Once you have created an Atlas Search index, you can query it with the $search stage in an aggregation pipeline.
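For intuition on how a search engine can make substring queries fast, here is a minimal trigram-index sketch in plain JavaScript. It illustrates the general n-gram idea only and is not a description of Atlas Search or Lucene internals. All function names (`ngrams`, `buildNgramIndex`, `searchSubstring`) are hypothetical.

```javascript
// Collect the distinct n-grams (default: trigrams) of a string.
function ngrams(text, n = 3) {
  const s = text.toLowerCase();
  const grams = new Set();
  for (let i = 0; i + n <= s.length; i++) grams.add(s.slice(i, i + n));
  return grams;
}

// Inverted index: n-gram -> set of document ids containing it.
function buildNgramIndex(docs, n = 3) {
  const index = new Map();
  for (const { id, text } of docs) {
    for (const g of ngrams(text, n)) {
      if (!index.has(g)) index.set(g, new Set());
      index.get(g).add(id);
    }
  }
  return index;
}

function searchSubstring(index, docs, query, n = 3) {
  const q = query.toLowerCase();
  // Queries shorter than n have no n-grams, so fall back to a full scan.
  if (q.length < n) {
    return docs.filter((d) => d.text.toLowerCase().includes(q)).map((d) => d.id);
  }
  // Candidates are the documents that contain every n-gram of the query.
  let candidates = null;
  for (const g of ngrams(q, n)) {
    const ids = index.get(g) ?? new Set();
    candidates = candidates === null
      ? new Set(ids)
      : new Set([...candidates].filter((id) => ids.has(id)));
  }
  // Verify each candidate, because the n-grams may not be contiguous in it.
  const byId = new Map(docs.map((d) => [d.id, d]));
  return [...candidates].filter((id) => byId.get(id).text.toLowerCase().includes(q));
}

const articles = [
  { id: 1, text: "Chocolate cheesecake recipe" },
  { id: 2, text: "Cake decorating basics" },
];
const ngramIndex = buildNgramIndex(articles);

console.log(searchSubstring(ngramIndex, articles, "cake"));   // [ 1, 2 ]
console.log(searchSubstring(ngramIndex, articles, "cheese")); // [ 1 ]
```

The trade-off is index size: every document contributes one entry per distinct n-gram, which is worth estimating before applying the idea to a collection of around 200m documents.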

But in order to use this feature, you need to use MongoDB Atlas. As far as I know, you can only create this search index in MongoDB Atlas. Once you have MongoDB Atlas set up, applying and using the search feature is straightforward: go to your collection in the Atlas UI and create the search index with a few clicks. You can fine-tune it (check the docs), but you can start with the default settings.

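Using it in your backend is simple. The example below uses the $search stage; the index name "default" and the field name "title" are assumptions, so replace them with your own. Note that the text operator matches analyzed words, so for true substring matching you would need an n-gram based index mapping or an operator such as wildcard (check the docs):

db.articles.aggregate(
   [
     // $search must be the first stage of the pipeline
     { $search: { index: "default", text: { query: "cake", path: "title" } } },
     { $group: { _id: null, views: { $sum: "$views" } } }
   ]
)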

Upvotes: 1
