Control order of token filters in ElasticSearch

Question

I'm trying to control the order in which token filters are applied in ElasticSearch.

I know from the docs that the tokenizer is applied first, then the token filters, but they do not mention how the order of the token filters is determined.

Here's a YAML snippet from my analysis setup script:

       KeywordNameIndexAnalyzer :
           type : custom
           tokenizer : whitespace
           filter : [my_word_concatenator, keyword_ngram]

I would have thought that my_word_concatenator would be applied before keyword_ngram, but it seems like that isn't the case. Anyone know how (or if) the order of these filters can be controlled?

Thanks a lot!

javanna · Accepted Answer

An analyzer starts with a tokenizer, which splits your text into tokens. The token filters are then applied in the order you configured them, since you're providing an array. If you have doubts, I'd suggest having a look at the Analyze API, which lets you test how an analyzer works without indexing any text.
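To see why the array order matters, here is a minimal Python sketch of a filter chain. It is not ElasticSearch code: `word_concatenator` and `ngram_filter` are hypothetical stand-ins for `my_word_concatenator` and `keyword_ngram`, and their behavior (join all tokens into one, emit fixed-length 3-grams) is an assumption, since the question does not show how those filters are defined.

```python
from typing import Callable, List

TokenFilter = Callable[[List[str]], List[str]]


def whitespace_tokenizer(text: str) -> List[str]:
    # Split on runs of whitespace, like a whitespace tokenizer would.
    return text.split()


def word_concatenator(tokens: List[str]) -> List[str]:
    # Hypothetical stand-in for my_word_concatenator:
    # joins every incoming token into a single token.
    return ["".join(tokens)] if tokens else []


def ngram_filter(tokens: List[str], n: int = 3) -> List[str]:
    # Hypothetical stand-in for keyword_ngram:
    # emits every n-character substring of each token.
    grams: List[str] = []
    for token in tokens:
        grams.extend(token[i:i + n] for i in range(len(token) - n + 1))
    return grams


def analyze(text: str, filters: List[TokenFilter]) -> List[str]:
    # Tokenize first, then run each filter in list order,
    # feeding the output of one filter into the next.
    tokens = whitespace_tokenizer(text)
    for token_filter in filters:
        tokens = token_filter(tokens)
    return tokens


in_order = analyze("red car", [word_concatenator, ngram_filter])
reversed_order = analyze("red car", [ngram_filter, word_concatenator])

print(in_order)        # ['red', 'edc', 'dca', 'car']
print(reversed_order)  # ['redcar']
```

Under these assumptions, the configured order produces n-grams that span the word boundary ("edc", "dca"), while the reversed order does not. Comparing the real output of the Analyze API against the tokens you expect is the way to confirm which order your analyzer is actually using.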
