Joey Yi Zhao

Reputation: 42418

Does tokenizer work for indexing or query or both in Elasticsearch?

I am looking at tokenizers in Elasticsearch 6.8. I know that a tokenizer defines how text is split into words when the index is built. For example, it would convert the text "Quick brown fox!" into the terms [Quick, brown, fox!]. If a field in Elasticsearch contains the text "Quick brown fox!", it will be broken into three terms in the index. But what if I send the query text "Quick brown fox!"? Does the tokenizer apply to that query parameter as well?

Upvotes: 2

Views: 3012

Answers (2)

Amit

Reputation: 32376

You can verify whether your search term is tokenized, and you can also check the tokens generated from your search query. Whether tokenization happens depends on several factors, such as the type of query (analyzed vs. non-analyzed).

A match query is an example of an analyzed query: the provided text is analyzed before matching. A term query is an example of a non-analyzed query: the provided search text is not analyzed and is sent as-is for searching.
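To make the difference concrete, here is a minimal sketch in plain Python. It does not call Elasticsearch; `toy_analyze` is a hypothetical stand-in that only roughly imitates the standard analyzer (lowercasing and splitting on non-word characters), and `match_query` / `term_query` are illustrative names, not client APIs.

```python
import re


def toy_analyze(text):
    """Rough stand-in for an analyzer: lowercase, then split on non-word characters."""
    return re.findall(r"\w+", text.lower())


# Index time: the field value is analyzed and the resulting terms are stored.
indexed_terms = set(toy_analyze("Quick brown fox!"))


def match_query(query_text):
    """Analyzed query: the query text is run through the analyzer before matching."""
    return any(term in indexed_terms for term in toy_analyze(query_text))


def term_query(query_text):
    """Non-analyzed query: the query text is compared to the indexed terms as-is."""
    return query_text in indexed_terms


print(sorted(indexed_terms))            # ['brown', 'fox', 'quick']
print(match_query("Quick brown fox!"))  # True
print(term_query("Quick"))              # False
print(term_query("quick"))              # True
```

Under these assumptions, the term query for "Quick" finds nothing because the index holds the lowercased term, while the match query succeeds because its text is analyzed the same way as the field.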

To check the tokens generated by a search query, use the Explain API, which returns information about why a specific document matches (or doesn't match) a query. In its output, you can see the tokens generated for your search terms.

Below is a sample snippet from the Explain API output, showing the search term token that Elasticsearch generated for the query.

"description": "weight(to:Foo in 0) [PerFieldSimilarity], result of:",

This API is a quick way to check the final tokens generated by Elasticsearch, which are the ones used for token-to-token matching.
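As a rough sketch, an Explain request could be assembled like this. The index name `my_index`, document ID `1`, and field `title` are placeholders, and the `/{index}/_doc/{id}/_explain` path is my assumption for the 6.x endpoint shape, so check it against the documentation for your version. The code only builds the request and does not send it.

```python
import json


def build_explain_request(index, doc_id, field, text):
    """Return the (path, body) pair for an Explain call using a match query."""
    # Assumed 6.x-style path; the index, ID, and field are supplied by the caller.
    path = f"/{index}/_doc/{doc_id}/_explain"
    body = {"query": {"match": {field: text}}}
    return path, body


path, body = build_explain_request("my_index", "1", "title", "Quick brown fox!")
print("GET", path)
print(json.dumps(body, indent=2))
```

The `description` entries in the response, like the one shown above, name the field and the term that was actually matched.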

Upvotes: 2

Val

Reputation: 217254

Analyzers work at both indexing time and query time, provided they are correctly configured in the field mappings of your index. A tokenizer is one component of an analyzer, so the same applies to tokenizers.

On this page, you get a complete description of when an analyzer kicks in, repeated below for clarity:

At index time, Elasticsearch will look for an analyzer in this order:

  • The analyzer defined in the field mapping.
  • An analyzer named default in the index settings.
  • The standard analyzer.

At query time, there are a few more layers:

  • The analyzer defined in a full-text query.
  • The search_analyzer defined in the field mapping.
  • The analyzer defined in the field mapping.
  • An analyzer named default_search in the index settings.
  • An analyzer named default in the index settings.
  • The standard analyzer.
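The two lookup orders above can be modeled as a simple first-match search. This is only an illustrative sketch, not Elasticsearch code: `field_mapping` is a plain dict, `index_analyzers` is a set of analyzer names assumed to be defined in the index settings, and both function names are hypothetical.

```python
def resolve_index_analyzer(field_mapping, index_analyzers):
    """Pick the analyzer name used at index time, following the order listed above."""
    candidates = [
        field_mapping.get("analyzer"),
        "default" if "default" in index_analyzers else None,
    ]
    # Fall back to the standard analyzer when nothing else is configured.
    return next((name for name in candidates if name), "standard")


def resolve_search_analyzer(field_mapping, index_analyzers, query_analyzer=None):
    """Pick the analyzer name used at query time, following the order listed above."""
    candidates = [
        query_analyzer,
        field_mapping.get("search_analyzer"),
        field_mapping.get("analyzer"),
        "default_search" if "default_search" in index_analyzers else None,
        "default" if "default" in index_analyzers else None,
    ]
    return next((name for name in candidates if name), "standard")


mapping = {"analyzer": "english", "search_analyzer": "whitespace"}
print(resolve_index_analyzer(mapping, set()))                   # english
print(resolve_search_analyzer(mapping, set()))                  # whitespace
print(resolve_search_analyzer(mapping, set(), "keyword"))       # keyword
print(resolve_search_analyzer({}, {"default"}))                 # default
print(resolve_search_analyzer({}, set()))                       # standard
```

In this model, a field with no `search_analyzer` falls through to its index-time `analyzer`, which is why query text is usually analyzed the same way as the indexed text.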

So as you can see, an analyzer can be leveraged both when you ingest data and also when you query it.

Upvotes: 3
