Reputation: 15602
I'm trying to control the order in which token filters are applied in Elasticsearch.
I know from the docs that the tokenizer is applied first, then the token filters, but they do not mention how the order of the token filters is determined.
Here's a YAML snippet from my analysis setup script:
KeywordNameIndexAnalyzer :
    type : custom
    tokenizer : whitespace
    filter : [my_word_concatenator, keyword_ngram]
I would have thought that my_word_concatenator would be applied before keyword_ngram, but it seems like that isn't the case. Does anyone know how (or if) the order of these filters can be controlled?
Thanks a lot!
Upvotes: 8
Views: 3084
Reputation: 60205
An analyzer consists of a tokenizer, which splits your text into tokens, followed by token filters. The token filters are applied in the order you configured them, since you're providing an array. If you have doubts, I'd suggest having a look at the Analyze API, which lets you test how an analyzer behaves without indexing any text.
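To check the order yourself, something like the following rough sketch could call the index-level _analyze endpoint and print the tokens your analyzer produces. It assumes a reasonably recent Elasticsearch version that accepts a JSON body with "analyzer" and "text" fields (older releases took these differently, so check the docs for your version). The node address, the index name my_index, and the helper names build_analyze_request and analyze are made up for illustration.

```python
import json
import urllib.request


def build_analyze_request(base_url, index, analyzer, text):
    """Build (but do not send) a POST request to <base_url>/<index>/_analyze."""
    # Assumption: the server accepts a JSON body with "analyzer" and "text".
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(base_url, index, analyzer, text):
    """Send the request and return the token strings from the response."""
    request = build_analyze_request(base_url, index, analyzer, text)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # The response is expected to hold a "tokens" list of token objects.
    return [token["token"] for token in payload["tokens"]]
```

With a node running locally and the analyzer defined on the index, calling analyze("http://localhost:9200", "my_index", "KeywordNameIndexAnalyzer", "foo bar") should return the final tokens. Swapping the two entries in the filter array and running it again is a quick way to see what effect the order has.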
Upvotes: 8