techalicious
techalicious

Reputation: 441

MongoDB incomplete search

I'm new to MongoDB but here goes:

Is there a way to do a form of partial search in MongoDB?

For example, if the document is something like { Name: Michael A. Johnson}, is there a way to format search so that it will return that document for a query of Name: Michael Johnson and vice versa?

Additionally, is there a method to search 'Name: Johnson' and return the document { Name: Michael A. Johnson}?

Thanks!

Upvotes: 0

Views: 811

Answers (1)

zamnuts
zamnuts

Reputation: 9582

As of MongoDB 2.6, $text (in conjunction with $search and $meta) can provide the search term functionality you describe.

Consider the following:

db.collection.ensureIndex({Name:'text'});
db.collection.find({
        $text: { $search: 'Michael Johnson' }
    },{
        score: { $meta: 'textScore' }
}).sort({
        score: {  $meta: 'textScore' }
});

Note that you do not need to ensureIndex all the time, the index will update as necessary. Also, all relevant indices will be used, so if you have multiple text-type indices, these will also be considered.

Per the documentation for $text:

$text performs a text search on the content of the fields indexed with a text index. $search (string) A string of terms that MongoDB parses and uses to query the text index. MongoDB performs a logical OR search of the terms unless specified as a phrase.

If you'd like to sort the result based on relevance (which is what is happening in the example above) use the meta textScore property via $meta (don't forget to duplicate in sort):

The $text operator assigns a score to each document that contains the search term in the indexed fields. The score represents the relevance of a document to a given text search query. The score can be part of a sort() method specification as well as part of the projection expression. The { $meta: "textScore" } expression provides information on the processing of the $text operation.


$text will not work on multiple fields individually. In this case, use $regex:

{ field: { $regex: '\bWORD\b', $options: 'i' } }

How to write regular expressions is out of scope. Do some searching on SO.

To mimic the behavior of $text where all "words" in the subject string are space-delimited "terms," you can create an array of regular expression objects by splitting on ' ' and mapping each term to a $regex object. If this is user-input then it is important to also escape all meta characters that could be considered part of the regular expression. Finally, construct an $or expression featuring all the subjects you want to search for, or alternatively an $and, $not, etc...

Here is a full example implementation with $or (logical OR):

var nameMongoSearch = strToMongoRegexArray('Michael Johnson','Name');
var almaMaterMongoSearch = strToMongoRegexArray('KU','AlmaMater');

// OR matching for both Name and AlmaMater terms
db.collection.find({
    $or: [].concat(nameMongoSearch).concat(almaMaterMongoSearch)
});

/*
 * When str = "Michael Johnson" and key = "Name"
 * convert to something like
 * [
 *   { Name: { $regex: '\\bMichael\\b', $options: 'i' } },
 *   { Name: { $regex: '\\bJohnson\\b', $options: 'i' } }
 * ]
 */
function strToMongoRegexArray(str,key) {
    // 
    return str
    .split(' ') // translate String to Array, split into "terms"
    .filter(Boolean) // filter empty strings (in the case of adjecent spaces)
    .map(function(str){ // translate each term into a mongodb regex
        var o = {};
        o[key] = {
            $regex: '\\b'+escapeRegExp(str)+'\\b', // the '\\b' encapsulation is for word boundaries
            $options: 'i' // the 'i' flag is for case insensitive matching
        };
        return o;
    });
}

/*
 * from https://stackoverflow.com/a/6969486/1481489
 * this will escape regex metacharacters for javascript for user input sanitation
 */
function escapeRegExp(str) {
  return str.replace(/[\-\[\]\/\{\}\(\)\*\+\?\.\\\^\$\|]/g, "\\$&");
}

If you want to logical AND, this replacement snippet could work:

db.collection.find({
    $and: [
        { $or: nameMongoSearch},
        { $or: almaMaterMongoSearch}
    ]
});

Note: By convention, field names are usually camelcase and begin with a lowercase letter, i.e. the field is "almaMater" and not "Alma Mater" nor "AlmaMater". But to keep aligned with your original question, I am maintaining the cap on the first letter.

Upvotes: 2

Related Questions