MonkeyBonkey
MonkeyBonkey

Reputation: 47911

Is there a way to optimize an auto-complete keyword query in mongo

I am creating an autocomplete service against my mongo db posts collection where a user can start to type the title of a post. It should return all posts with that keyword in the title and sort by a field called 'total'.

I have a field called "lower" that is a lowercase version of the title field we want to search, and that has an index set on it. Since I am looking for any keyword match, I am doing a regex search against lowerCaseTitle for words that appear anywhere in the title, not just in the beginning.

I looked at the execution plan and it looks like does a scan through every item (the full posts collection has 10061 items). I tried hinting at both the "lower_1" index and the "total_-1" index and they seem similar, but the total index seems better with a lower nscanned number if I set a limit number of 50 to the query. Is there anything I can do to optimize? I can't think of anything simple off the top of the head for this kind of full-text searching.

"cursor" : "BtreeCursor lower_1",
    "nscanned" : 10061,
    "nscannedObjects" : 2,
    "n" : 2,
    "scanAndOrder" : true,
    "millis" : 154,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "isMultiKey" : false,
    "indexOnly" : false,
    "indexBounds" : {
        "lower" : [
            [
                "",
                {

                }
            ]
        ]
    }

Upvotes: 3

Views: 3762

Answers (1)

mnemosyn
mnemosyn

Reputation: 46331

I am doing a regex search against lowerCaseTitle for words that appear anywhere in the title, not just in the beginning.

From the documentation:

For simple prefix queries (also called rooted regexps) like /^prefix/, the database will use an index when available and appropriate (much like most SQL databases that use indexes for a LIKE 'prefix%' expression). This only works if you don't have i (case-insensitivity) in the flags.

In other words, no, MongoDB does not support searching substrings in a fast manner.

However, you can do the following to support words starting with a given string:

BlogPost { 
  Title : "This is furiously interesting post"
  TitleSearch : [ "this", "is", "furiously", "interesting", "post" ]
}

Now, with TitleSearch indexed and using a rooted regex, a search for 'inter' would return the sample, also for 'furious', but not for 'eresting'.

Upvotes: 6

Related Questions