Reputation: 795
I am following the tutorial "How to Speed-Up MongoDB Regex Queries by a Factor of up-to 10" and I am using the query given at the end of it:
db.movies.find({
    $and: [
        { $text: { $search: "Moss Carrie-Anne" } },
        { cast: { $elemMatch: { $regex: /Moss/, $regex: /Carrie-Anne/ } } }
    ]
});
The problem I am stuck on is how to generate the sub-query
$elemMatch: {$regex: /Moss/, $regex: /Carrie-Anne/}
programmatically in Python.
My code so far:
def regexGen(s):
    d = {}
    for word in s.split(" "):
        d["$regex"] = "/" + word + "/"  # this will of course save only the last value into the dict
    return d

query = {
    "$and": [
        {"$text": {"$search": "Moss Carrie-Anne"}},
        {"cast": {"$elemMatch": regexGen("Moss Carrie-Anne")}},
    ]
}
print(query)
#actual
# {'$and': [{'$text': {'$search': 'Moss Carrie-Anne'}}, {'cast': {'$elemMatch': {'$regex': '/Carrie-Anne/'}}}]}
#expected
# {'$and': [{'$text': {'$search': 'Moss Carrie-Anne'}}, {'cast': {'$elemMatch': {'$regex': '/Carrie-Anne/'}, {'$regex': '/Moss/'} }}]}
I am obviously missing something here, but I cannot figure out what.
Upvotes: 2
Views: 397
Reputation: 627292
You may build a dynamic regex based on alternation:
{ "$regex" : "|".join([re.escape(word) for word in s.split()]) }
See the Python demo:
import re
s = "Moss Carrie-Anne"
print({ "$regex" : "|".join([re.escape(word) for word in s.split()]) })
# => {'$regex': 'Moss|Carrie\\-Anne'}   (the dict repr doubles the backslash; the pattern itself is Moss|Carrie\-Anne)
Note that Moss|Carrie\-Anne will match either Moss or Carrie-Anne. re.escape will be helpful if you have (, + and other regex special chars in your literal input.
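To tie this back to the full query from the question, here is a minimal sketch of how the pieces might be assembled. The helper names build_cast_regex and build_query and the match_all flag are made up for illustration and are not part of PyMongo or the tutorial. The pattern is passed as a plain string: the /.../ delimiters are regex-literal syntax in the mongo shell, so adding them to a Python string would make the slashes part of the pattern. The match_all branch is an assumption about what the original query may have intended (every word present in the same element) and relies on lookaheads being supported by the server's regex engine.

```python
import re


def build_cast_regex(search, match_all=False):
    """Build a $regex pattern string from the whitespace-separated words in `search`.

    Hypothetical helper for illustration only.
    """
    # Escape each word so characters such as ( or + are treated literally.
    words = [re.escape(word) for word in search.split()]
    if match_all:
        # One lookahead per word: every word must occur somewhere in the string.
        return "".join("(?=.*{})".format(word) for word in words)
    # Alternation: any one of the words is enough.
    return "|".join(words)


def build_query(search, match_all=False):
    """Assemble the $text + $elemMatch query from the question as a Python dict."""
    return {
        "$and": [
            {"$text": {"$search": search}},
            {"cast": {"$elemMatch": {"$regex": build_cast_regex(search, match_all)}}},
        ]
    }


query = build_query("Moss Carrie-Anne")
print(query)
```

The resulting dict can then be passed to find() on the movies collection. Switching to build_query("Moss Carrie-Anne", match_all=True) would require both words to appear in the same cast entry instead of either one.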
Upvotes: 1