Reputation: 127
I am looking into using MongoDB's built-in Snowball stemmer for a project, as described here: https://blog.codecentric.de/en/2013/01/text-search-mongodb-stemming/
I have not been able to find an example or a command that actually returns the stemmed words.
Example:
A record containing {txt: "I waited for hours"}
How can I get the stemmed version of txt returned, e.g. "I wait for hour"?
Upvotes: 1
Views: 2217
Reputation: 123
I don't know when this was introduced in MongoDB, but in v3.2.15, calling cursor.explain() on a text query shows the stemmed words in a field named "parsedTextQuery". In my current usage I find them at cursor.explain().queryPlanner.winningPlan.parsedTextQuery.terms.
db.foo.find({"$text":{$search:"robots constabulary synchronized \"true love\" -human -\"false promises\""}}).explain()
// queryPlanner.winningPlan...
"parsedTextQuery" : {
    "terms" : [
        "constabulari",
        "love",
        "robot",
        "synchron",
        "true"
    ],
    "negatedTerms" : [
        "human"
    ],
    "phrases" : [
        "true love"
    ],
    "negatedPhrases" : [
        "false promises"
    ]
},
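If you want the terms programmatically rather than by reading the explain output, something like the following might work. This is only a sketch: findParsedTextQuery and stemmedTerms are hypothetical helper names of my own, not part of MongoDB, and the code searches the explain document recursively on the assumption that the exact location of "parsedTextQuery" can differ between server versions and plan shapes.

```javascript
// Hypothetical helper: walk an explain() result and return the first
// "parsedTextQuery" object found, or null if there is none.
// The path is not hard-coded because the nesting may vary by version.
function findParsedTextQuery(node) {
  if (node === null || typeof node !== "object") {
    return null;
  }
  if (Object.prototype.hasOwnProperty.call(node, "parsedTextQuery")) {
    return node.parsedTextQuery;
  }
  // Object.keys works for both plain objects and arrays (indices as keys).
  var keys = Object.keys(node);
  for (var i = 0; i < keys.length; i++) {
    var found = findParsedTextQuery(node[keys[i]]);
    if (found !== null) {
      return found;
    }
  }
  return null;
}

// Hypothetical helper: return the stemmed terms, or an empty array
// when the explain output contains no parsed text query.
function stemmedTerms(explainOutput) {
  var parsed = findParsedTextQuery(explainOutput);
  if (parsed !== null && typeof parsed === "object" && Array.isArray(parsed.terms)) {
    return parsed.terms;
  }
  return [];
}
```

In the shell you would call it as stemmedTerms(db.foo.find({$text: {$search: "I waited for hours"}}).explain()). Note that this gives you the terms as the text search sees them, so stop words are typically dropped and word order is not preserved; you should not expect to get "I wait for hour" back verbatim.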
Upvotes: 2
Reputation: 65313
Snowball is a commonly used open source stemming approach, with implementations/ports for many (if not most) programming languages.
If you only want stemming for your application, you should use the Snowball library directly.
MongoDB 2.4+ uses Snowball internally for text stemming and indexing, but does not expose a separate API for calling Snowball directly.
Upvotes: 2