Jens

Reputation: 127

MongoDB stemming text

I am looking into using MongoDB's built-in Snowball stemmer for a project, as described here: https://blog.codecentric.de/en/2013/01/text-search-mongodb-stemming/

I have not been able to find an example or a command that actually returns the stemmed words.

Example:

A record containing {txt: "I waited for hours"}

How can I get the stemmed version of txt returned, e.g. "I wait for hour"?

Upvotes: 1

Views: 2217

Answers (2)

Anderson Wiese

Reputation: 123

I don't know when this was introduced in MongoDB, but in v3.2.15, calling cursor.explain() on a text query shows the stemmed words in a field named "parsedTextQuery". In my current usage I find it at cursor.explain().queryPlanner.winningPlan.parsedTextQuery.terms.

db.foo.find({"$text":{$search:"robots constabulary synchronized \"true love\" -human -\"false promises\""}}).explain()

// queryPlanner.winningPlan...

        "parsedTextQuery" : {
            "terms" : [
                "constabulari",
                "love",
                "robot",
                "synchron",
                "true"
            ],
            "negatedTerms" : [
                "human"
            ],
            "phrases" : [
                "true love"
            ],
            "negatedPhrases" : [
                "false promises"
            ]
        },
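To pull those terms out programmatically, something like the following may work. It is only a sketch: the helper names findParsedTextQuery and stemmedTerms are made up for illustration, and the sample object is hand-written to mirror the output above, not captured from a server. Because the exact position of "parsedTextQuery" inside the plan can vary (it may be nested under other stages), the helper searches the whole explain document instead of assuming a fixed path.

```javascript
// Hypothetical helper: recursively search an explain() result for the
// first "parsedTextQuery" object, wherever it sits in the plan tree.
function findParsedTextQuery(node) {
  if (node === null || typeof node !== "object") {
    return null;
  }
  if (Object.prototype.hasOwnProperty.call(node, "parsedTextQuery")) {
    return node.parsedTextQuery;
  }
  var keys = Object.keys(node);
  for (var i = 0; i < keys.length; i++) {
    var found = findParsedTextQuery(node[keys[i]]);
    if (found) {
      return found;
    }
  }
  return null;
}

// Hypothetical helper: return the stemmed terms, or [] if none are present.
function stemmedTerms(explainOutput) {
  var parsed = findParsedTextQuery(explainOutput);
  return parsed && Array.isArray(parsed.terms) ? parsed.terms : [];
}

// Hand-written sample shaped like the explain() output shown above.
var sampleExplain = {
  queryPlanner: {
    winningPlan: {
      stage: "TEXT",
      parsedTextQuery: {
        terms: ["constabulari", "love", "robot", "synchron", "true"],
        negatedTerms: ["human"],
        phrases: ["true love"],
        negatedPhrases: ["false promises"]
      }
    }
  }
};

console.log(stemmedTerms(sampleExplain));
```

In the mongo shell, the same helper could be applied to a real query, for example stemmedTerms(db.foo.find({$text: {$search: "I waited for hours"}}).explain()), assuming the collection has a text index. Stop words are typically dropped from the parsed query, so the result may contain fewer words than the input.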

Upvotes: 2

Stennie

Reputation: 65313

Snowball is a widely used open-source stemming framework, with implementations/ports for many (if not most) programming languages.

If you only want stemming for your application, you should use the Snowball library directly.

MongoDB 2.4+ uses Snowball internally for text stemming and indexing, but does not expose Snowball through a separate API.

Upvotes: 2
