Аушев

Reputation: 45

Search speed in Elasticsearch does not depend on whether the analyzer is used

I am trying to use Elasticsearch in my project. I created an index:

curl --location --request PUT 'http://localhost:9200/customers' --header 'Content-Type: application/json' --data-raw '{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "my_tokenizer"
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "ngram",
          "min_gram": 3,
          "max_gram": 40,
          "token_chars": [
            "letter",
            "digit"
          ]
        }
      }
    },
    "index": {
        "max_ngram_diff" : "37"
    }
  }
}'

I added 200 thousand records and search across all fields with this query:

curl --location --request GET 'http://localhost:9200/customers/_search?q=*2000018*' --header 'Content-Type: application/json'
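
For reference, the records were added with the regular index API. A minimal sketch of a single insert (the title and name fields here are hypothetical, only to show the document shape):

curl --location --request POST 'http://localhost:9200/customers/_doc' --header 'Content-Type: application/json' --data-raw '{
  "title": "2000018",
  "name": "Some customer"
}'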

The bottom line is that if I create the index without the analyzer, the speed stays the same. I used ngram in the first case and was, in principle, pleased with the performance, but I decided to make sure that this was really because the analyzer was configured correctly. Without it the results are similar, so it turns out the analyzer has no effect at all, which means I am doing something wrong.

I tried to find the answer in the documentation and similar questions, but still can't figure out what I am doing wrong

I would be glad for any help.

Upvotes: 3

Views: 99

Answers (1)

Bhavya

Reputation: 16192

It looks like you have defined the analyzer only in the settings part.
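
You can confirm this by looking at the current mapping (assuming the index name customers from the question); none of the dynamically created fields will reference my_analyzer, so they are all still analyzed with the default standard analyzer:

curl --location --request GET 'http://localhost:9200/customers/_mapping' --header 'Content-Type: application/json'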

You need to define the analyzer in the mappings part also.

In the field mapping, add an analyzer setting that points to my_analyzer so that it is applied at index time.

Refer to the official ES documentation on analyzers to learn more.


Suppose the field on which you want to use the ngram analyzer is title. The modified index mapping for that case would be:

{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "my_tokenizer"
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "ngram",
          "min_gram": 3,
          "max_gram": 40,
          "token_chars": [
            "letter",
            "digit"
          ]
        }
      }
    },
    "index": {
      "max_ngram_diff": "37"
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "my_analyzer"
      }
    }
  }
}
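
After reindexing the data with this mapping, you can check that the custom analyzer is actually applied. A short sketch (the index name customers comes from the question; the sample text and query value are just examples):

curl --location --request GET 'http://localhost:9200/customers/_analyze' --header 'Content-Type: application/json' --data-raw '{
  "analyzer": "my_analyzer",
  "text": "2000018"
}'

curl --location --request GET 'http://localhost:9200/customers/_search' --header 'Content-Type: application/json' --data-raw '{
  "query": {
    "match": {
      "title": "2000018"
    }
  }
}'

The first call returns the ngram tokens produced by my_analyzer; the second searches the title field with a match query, which uses the same analyzer instead of the wildcard-style q=*2000018* request.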

Upvotes: 1
