Reputation: 5
I have data with the following structure (dummy data):
news = [
{
"name" : "news1",
"url" : "https://news2.com/feed",
"datetime" : 1234567889,
"titles" : [
"Vivamus dapibus tortor ut quam interdum volutpat.",
"Quisque ut arcu a est hendrerit ullamcorper at nec sem.",
"Praesent dictum enim ut ultrices hendrerit.",
"Mauris sit amet dolor at turpis viverra mollis sit amet a elit.",
"Donec non eros in sapien luctus hendrerit quis sit amet nisi."
]
},
{
"name" : "news2",
"url" : "https://news2.com/feed",
"datetime" : 12345678,
"titles" : [
"Nullam at orci quis sem volutpat consectetur.",
"Proin finibus lorem at facilisis varius.",
"Aenean at erat a odio imperdiet volutpat in ac lorem.",
"Donecnon eros hendrerit quis sit amet nisi.",
"Curabitur dapibus risus nec vulputate maximus."
]
},
]
I have a text index on the titles field. I would like to write an aggregation query that finds only those titles in which the searched word appears as an exact, whole word. For example, when searching for 'Donec', titles that only contain 'Donecnon' should not match.
I have tried regex and full text search as well:
db.collection.aggregate([
{ '$unwind' : "$titles"},
{
'$match': {
'titles': { '$regex': searchedword, '$options':'i' }
}
},
{ '$project': {
'_id': 0, 'titles': 1,
'name': 1,'datetime':1
}
},
{"$sort": {"datetime": -1}}
])
and:
db.power_of_words.aggregate([
{ '$match': { $text: { $search: "\"searchedword\"" }} },
{ '$unwind' : "$titles"},
{
'$match': {
'titles': /searchedword/
}
},
])
This plain find query also returns everything:
db.collection.find({$text: {$search: "\"searchedword\""}}, {score: {$meta: "textScore"}}).sort({score:{$meta:"textScore"}})
Nothing has worked; the result always contains the titles with 'Donecnon' too.
I would prefer full-text search because, as far as I know, it is more efficient and performs better.
Upvotes: 0
Views: 59
Reputation: 3010
A regex cannot be used inside a $text search.
The MongoDB documentation says:
text indexes can include any field whose value is a string or an array of string elements.
For more information please check https://docs.mongodb.com/manual/core/index-text/
Thus, we need to do it the following way:
db.collection.aggregate([
{
$unwind:"$titles"
},
{
$match:{
"titles":/\bDonec\b/i
}
}
]).pretty()
Sample output:
{
"name" : "news1",
"url" : "https://news2.com/feed",
"datetime" : 1234567889,
"titles" : "Donec non eros in sapien luctus hendrerit quis sit amet nisi."
}
Note: We are using '\b' to impose a word boundary. This helps eliminate titles that don't contain the searched string as a complete word.
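The query above hardcodes 'Donec', while the question passes the word in a variable. A minimal sketch of building the same word-boundary pattern from a variable follows; the helper names escapeRegex, wholeWordRegex and buildPipeline are hypothetical, not part of MongoDB. It assumes the search word starts and ends with a letter, digit or underscore, since '\b' only marks a boundary next to such characters and may not treat accented letters as word characters.

```javascript
// Hypothetical helper: escape regex metacharacters so the word is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: case-insensitive pattern that matches `word` only as a whole word.
function wholeWordRegex(word) {
  return new RegExp("\\b" + escapeRegex(word) + "\\b", "i");
}

// Hypothetical helper: the stages from the answer, plus the $project and $sort
// stages from the question, with the pattern built from a variable.
function buildPipeline(word) {
  return [
    { $unwind: "$titles" },
    { $match: { titles: wholeWordRegex(word) } },
    { $project: { _id: 0, name: 1, datetime: 1, titles: 1 } },
    { $sort: { datetime: -1 } },
  ];
}

// Local check of the pattern against two titles from the dummy data (no database needed).
const titles = [
  "Donec non eros in sapien luctus hendrerit quis sit amet nisi.",
  "Donecnon eros hendrerit quis sit amet nisi.",
];
const matched = titles.filter((t) => wholeWordRegex("Donec").test(t));
console.log(matched);
```

In the shell this could then be run as db.collection.aggregate(buildPipeline(searchedword)). Escaping the input keeps characters such as '.' or '(' in a user-supplied word from being interpreted as regex syntax.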
Upvotes: 1