tarmes

Reputation: 15442

Find all Mongo documents with an array containing all search terms

I have a set of documents that each contain an array of search terms, e.g.

[ "apples", "oranges", "bananas" ]

The user will enter a search string of keyword prefixes, and I'd like to match all the documents that contain each term in the array. So, for example, "app oranges" will match the list above, but "applet oranges" wouldn't.

It would be fairly trivial to construct an $and query that checks, using $regex, that each keyword matches one of the items in the array as a prefix. However, that doesn't go far enough.

Each keyword should have a unique match within the set, such that searching "apples app" will not match the list above, because the "app" term can't match against "apples" since "apples" has already been matched. This constraint leads to a more subtle problem. Take this set as an example:

[ "france", "fred", "freddy" ]

If the user types "fr france" then this should match. It's important that the match for "fr" doesn't remove "france" from the list of possible terms for the remaining keywords, otherwise the test for the keyword "france" that follows would fail.

I need to implement this as a Mongo query. I'm quite new to Mongo and I haven't a clue where to start, or even if this is possible. Can it be done? If so, how?

Upvotes: 1

Views: 324

Answers (1)

BatScream

Reputation: 19700

To start with, you can use the $regex operator to match text patterns:

var searchTerms = "app oranges".split(" ");
var arr = [];
searchTerms.forEach(function (term) {
    // Anchor each term with ^ so it only matches as a prefix
    var reg = new RegExp("^" + term);
    arr.push({ "names": { $regex: reg } });
});
db.collection.find({ $and: arr });

This would give you the documents whose names array contains a value starting with app and a value starting with oranges.
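
Since the search string comes from the user, it is probably worth escaping regex metacharacters before building the patterns, otherwise a term such as "a.b" would also match "axb". A minimal sketch, assuming the array field is called names as above; escapeRegex and buildPrefixQuery are hypothetical helper names, not part of MongoDB:

```javascript
// Hypothetical helper: escape regex metacharacters so user input is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build the filter for a whitespace-separated search string.
// "field" is the name of the array field ("names" in the example above).
function buildPrefixQuery(search, field) {
  var terms = search.split(/\s+/).filter(function (t) { return t.length > 0; });
  var clauses = terms.map(function (t) {
    var clause = {};
    clause[field] = { $regex: new RegExp("^" + escapeRegex(t)) };
    return clause;
  });
  // Avoid sending an empty $and array; an empty filter matches every document.
  return clauses.length > 0 ? { $and: clauses } : {};
}
```

The result can be passed straight to find, e.g. db.collection.find(buildPrefixQuery("app oranges", "names")).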

Each keyword should have a unique match within the set, such that searching "apples app" will not match the list above, because the "app" term can't match against "apples" since "apples" has already been matched. This constraint leads to a more subtle problem. Take this set as an example:

This logic should be carried out on the application server, before or after firing the query. If the user enters a string that is a substring of an earlier input, then the query is bound to fail, since it would already have matched the former.
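
One possible way to do that check in the application, after the $and query has returned its candidate documents, is to treat it as a bipartite matching problem: each keyword must be paired with a distinct array item that it is a prefix of. The sketch below uses the standard augmenting-path approach, so an earlier keyword can be moved to another item when a later keyword needs the one it took (the "fr france" case from the question). hasUniqueMatches is a hypothetical name, and the code assumes plain case-sensitive prefix matching:

```javascript
// Returns true if every keyword can be paired with a distinct item
// that starts with it (bipartite matching via augmenting paths).
function hasUniqueMatches(keywords, items) {
  // assignedTo[j] is the index of the keyword paired with items[j], or -1 if free.
  var assignedTo = items.map(function () { return -1; });

  // Try to place keyword k, re-homing already placed keywords if necessary.
  function place(k, visited) {
    for (var j = 0; j < items.length; j++) {
      // Skip items already tried in this pass or not prefixed by the keyword.
      if (visited[j] || items[j].indexOf(keywords[k]) !== 0) continue;
      visited[j] = true;
      if (assignedTo[j] === -1 || place(assignedTo[j], visited)) {
        assignedTo[j] = k;
        return true;
      }
    }
    return false;
  }

  // Every keyword needs its own item; stop at the first one that cannot be placed.
  return keywords.every(function (_, k) {
    return place(k, items.map(function () { return false; }));
  });
}
```

The documents returned by the query could then be filtered with something like docs.filter(function (d) { return hasUniqueMatches(searchTerms, d.names); }), again assuming the array field is called names.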

Upvotes: 1
