Rohit Patwa

Reputation: 1160

Stemming in Elasticsearch replaces the original string

I used the following settings to create an Elasticsearch index:

"settings": {
    "analysis" : {
        "analyzer" : {
            "my_analyzer" : {
                "tokenizer" : "standard",
                "filter" : ["standard", "lowercase", "my_stemmer"]
            }
        },
        "filter" : {
            "my_stemmer" : {
                "type" : "stemmer",
                "name" : "english"
            }
        }
    }
}

I noticed that during analysis the stemmer replaces the original token with its stemmed form. Is there a way to index both the original string and the stemmed token?

Upvotes: 0

Views: 662

Answers (1)

Karsten R.

Reputation: 1758

What you are asking for is a "preserve_original" parameter on the stemmer token filter.

Some filters have "preserve_original", e.g. the Word Delimiter Token Filter, but the stemmer token filter does not.

If you need the original word, e.g. for aggregations, you can copy the field to a second field that uses a suitable analyzer.
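A minimal sketch of that copy approach, assuming a recent Elasticsearch version with typeless mappings (older versions need a mapping type name and a "string" field with "index": "not_analyzed" instead of "keyword"). The field names "title" and "title_original" are hypothetical. "title" is stemmed by "my_analyzer" from the question, and "copy_to" feeds the unmodified value into a keyword field that can be used for aggregations:

```json
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "my_analyzer",
        "copy_to": "title_original"
      },
      "title_original": {
        "type": "keyword"
      }
    }
  }
}
```

Searches would then go against "title" and aggregations against "title_original". A multi-field ("fields" inside the "title" mapping) should give a similar result without a separate top-level field.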

If you need the original at the same position in your index, you have to wrap the stemmer and build your own analyzer as a plugin.
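Depending on your Elasticsearch version, it may be worth trying the built-in "keyword_repeat" token filter before writing a plugin. It emits every token twice at the same position, once marked as a keyword, and the stemmer skips keyword-marked tokens. The sketch below is untested against the asker's setup; the filter name "unique_stem" is hypothetical, and the "standard" token filter from the question is left out because it was removed in Elasticsearch 7.0:

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "keyword_repeat", "my_stemmer", "unique_stem"]
        }
      },
      "filter": {
        "my_stemmer": {
          "type": "stemmer",
          "name": "english"
        },
        "unique_stem": {
          "type": "unique",
          "only_on_same_position": true
        }
      }
    }
  }
}
```

The "unique" filter with "only_on_same_position" drops the duplicate when the stem is identical to the original word, so unchanged tokens are not indexed twice. The _analyze API can be used to check which tokens actually come out.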

Upvotes: 2
