zoltanpal

Reputation: 5

Need help with full-text search in MongoDB

I have data with the following structure (dummy data):

news = [
    {
        "name" : "news1",
        "url" : "https://news2.com/feed",
        "datetime" : 1234567889,
        "titles" : [
            "Vivamus dapibus tortor ut quam interdum volutpat.",
            "Quisque ut arcu a est hendrerit ullamcorper at nec sem.",
            "Praesent dictum enim ut ultrices hendrerit.",
            "Mauris sit amet dolor at turpis viverra mollis sit amet a elit.",
            "Donec non eros in sapien luctus hendrerit quis sit amet nisi."
        ]
    },
    {
        "name" : "news2",
        "url" : "https://news2.com/feed",
        "datetime" : 12345678,
        "titles" : [ 
            "Nullam at orci quis sem volutpat consectetur.", 
            "Proin finibus lorem at facilisis varius.", 
            "Aenean at erat a odio imperdiet volutpat in ac lorem.",
            "Donecnon eros hendrerit quis sit amet nisi.",
            "Curabitur dapibus risus nec vulputate maximus."
        ]
    },        
]

I have a text index on the titles field. I would like to write a query (an aggregation) that returns only those titles that contain the searched word as an exact, whole word. For example, if the searched word is 'Donec', titles containing 'Donecnon' should not be returned.

I have tried regex and full text search as well:

db.collection.aggregate([
    { '$unwind' : "$titles"}, 
    {
        '$match': {
                'titles': { '$regex':  searchedword, '$options':'i' }
            }
    },
    { '$project': {
            '_id': 0, 'titles': 1,
            'name': 1,'datetime':1
        }
    },
    {"$sort": {"datetime": -1}}
])

and:

db.power_of_words.aggregate([
    { '$match': { $text: { $search: "\"searchedword\"" }} },
    { '$unwind' : "$titles"},
    {
        '$match': {
                'titles': /searchedword/
            }
    },
])

This query returns everything:

db.collection.find({$text: {$search: "\"searchedword\""}}, {score: {$meta: "textScore"}}).sort({score:{$meta:"textScore"}})

None of these has worked: the result always contains the 'Donecnon' title too.

I would prefer full-text search because, as far as I know, it is much more efficient and performs better.

Upvotes: 0

Views: 59

Answers (1)

Himanshu Sharma

Reputation: 3010

We cannot use a regex inside the text search.

The MongoDB documentation says:

text indexes can include any field whose value is a string or an array of string elements.

For more information please check https://docs.mongodb.com/manual/core/index-text/

Thus, we need to do it the following way:

db.collection.aggregate([
    {
        $unwind:"$titles"
    },
    {
        $match:{
            "titles":/\bDonec\b/i
        }
    }
]).pretty()

Sample output:

{
    "name" : "news1",
    "url" : "https://news2.com/feed",
    "datetime" : 1234567889,
    "titles" : "Donec non eros in sapien luctus hendrerit quis sit amet nisi."
}
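
Since the question prefers to keep using the text index, here is a rough sketch of how the two could be combined: a $text stage first to narrow down the candidate documents, then the same $unwind and word-boundary $match as above. This assumes the text index on titles exists as described in the question, and that the searched word is not treated as a stop word by the index language. The $project and $sort stages are copied from the question. I have not measured whether this is actually faster on real data.

```javascript
// Sketch only: assumes a text index on "titles" (as stated in the question).
const pipeline = [
    // A $match with $text has to be the first stage of the pipeline.
    // It only selects whole documents, not individual array elements.
    { $match: { $text: { $search: "Donec" } } },
    // One output document per title.
    { $unwind: "$titles" },
    // Keep only the titles where "Donec" appears as a whole word.
    { $match: { titles: /\bDonec\b/i } },
    // Same projection and ordering as in the question.
    { $project: { _id: 0, name: 1, datetime: 1, titles: 1 } },
    { $sort: { datetime: -1 } }
];

// In the mongo shell this would be run as:
// db.collection.aggregate(pipeline)
```

The second $match is still needed because $text works on the document level: a document that matches is returned with all of its titles, so after $unwind each title has to be checked on its own.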

Note: We are using '\b' to impose a word boundary. This eliminates titles that do not contain the searched string as a complete word.
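
If the searched word comes from a variable (like searchedword in the question), the pattern can be built with RegExp instead of a literal. A minimal sketch follows; escapeRegExp and wholeWordRegex are hypothetical helper names of mine, not MongoDB functions, and the two titles are taken from the dummy data in the question.

```javascript
// Hypothetical helper: escape regex metacharacters so the input is matched literally.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: case-insensitive pattern that matches `word` only as a whole word.
function wholeWordRegex(word) {
    return new RegExp('\\b' + escapeRegExp(word) + '\\b', 'i');
}

// Two titles from the question's dummy data.
const titles = [
    "Donec non eros in sapien luctus hendrerit quis sit amet nisi.",
    "Donecnon eros hendrerit quis sit amet nisi."
];

const pattern = wholeWordRegex("Donec");
// Only the first title survives the filter; "Donecnon" has no boundary after "Donec".
const matches = titles.filter(function (title) { return pattern.test(title); });
```

In the aggregation the same pattern would go into the $match stage, i.e. { $match: { titles: wholeWordRegex(searchedword) } }. Keep in mind that '\b' only helps when the searched word starts and ends with a word character (letter, digit or underscore), so a term such as "C++" would need a different boundary check.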

Upvotes: 1
