Hosang Jeon

Reputation: 1423

How can I use a prefix query on Korean words in Elasticsearch?

Elasticsearch has worked well for me on English documents, but I am stuck on prefix queries for Korean words.

Specifically, a document contains a word such as "한글", and I want to retrieve that document with a prefix query whose search term is not only "한" but also "ㅎ".

I could not do that with the default settings. It seems to be related to icu_normalizer, NFD decomposition, or something similar, but I do not fully understand what I need to do to get "한글" back for the search term "ㅎ".

Can anyone help me?

Thanks in advance.

Upvotes: 1

Views: 370

Answers (1)

JeongHoon Baek

Reputation: 56

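Maybe this helps. The settings below use an icu_normalizer char filter (it requires the analysis-icu plugin) to decompose Hangul syllables into individual jamo, then an edge n-gram tokenizer to index every prefix of each token. The second command lets you inspect the tokens the analyzer produces.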

curl -XPUT '127.0.0.1:9200/test' -d '{
  "settings" : {
    "analysis": {
      "tokenizer" : {
        "autocomplete_tokenizer" : {
          "type" : "edgeNGram",
          "min_gram" : "1",
          "max_gram" : "30",
          "token_chars": ["letter", "digit"]
        }
      },
      "char_filter" : {
        "nfd_normalizer" : {
          "type" : "icu_normalizer",
          "name": "nfc",
          "mode": "decompose"
        }
      },
      "analyzer": {
        "autocomplete_analyzer": {
          "type": "custom",
          "char_filter": ["nfd_normalizer"],
          "tokenizer": "autocomplete_tokenizer"
        }
      }
    }
  }
}'
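
To see why the decomposition matters, it can help to look at the code points outside Elasticsearch. The Python sketch below uses only the standard library's unicodedata module; prefix_matches is a hypothetical helper for illustration, not anything Elasticsearch provides. It assumes "ㅎ" is typed as the compatibility jamo U+314E, which is what Korean input methods typically produce for a lone consonant. Canonical decomposition (NFD) leaves that character untouched, while compatibility decomposition (NFKD) maps it to the conjoining jamo U+1112 that the decomposed "한" starts with.

```python
import unicodedata

word = "한글"

# NFD splits each precomposed syllable into conjoining jamo.
nfd_word = unicodedata.normalize("NFD", word)
code_points = [f"U+{ord(c):04X}" for c in nfd_word]
print(code_points)


def prefix_matches(query: str, indexed: str, form: str) -> bool:
    """Hypothetical helper: normalize both sides with the same
    Unicode form, then do a plain string prefix test."""
    q = unicodedata.normalize(form, query)
    d = unicodedata.normalize(form, indexed)
    return d.startswith(q)


typed = "\u314e"  # "ㅎ" as a compatibility jamo (assumed keyboard input)

print(prefix_matches("한", word, "NFD"))    # syllable prefix
print(prefix_matches(typed, word, "NFD"))   # lone consonant, canonical form
print(prefix_matches(typed, word, "NFKD"))  # lone consonant, compatibility form
```

If the "ㅎ" query returns nothing with the settings above, one thing to try is "name": "nfkc" in the char filter (which, combined with "mode": "decompose", should correspond to NFKD), with the same analyzer applied to the search term. Treat that as an assumption to check with _analyze rather than a tested fix.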

curl '127.0.0.1:9200/test/_analyze?pretty=1&analyzer=autocomplete_analyzer' -d '아버지가 방에 들어가신다. 태권-V'
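
As a rough picture of the tokens this _analyze call is expected to return, here is a Python approximation of the analyzer: NFD decomposition followed by edge n-grams over runs of letters and digits. The function analyze is hypothetical and only mimics the settings above. The real response also carries offsets and positions, and Elasticsearch's character classes may differ slightly from Python's isalpha and isdigit.

```python
import unicodedata


def analyze(text: str, min_gram: int = 1, max_gram: int = 30) -> list[str]:
    """Approximate the custom analyzer: decompose, split on anything
    that is not a letter or digit, emit every prefix of each run."""
    decomposed = unicodedata.normalize("NFD", text)
    tokens: list[str] = []
    current = ""
    # The trailing space flushes the last run of characters.
    for ch in decomposed + " ":
        if ch.isalpha() or ch.isdigit():
            current += ch
        elif current:
            longest = min(max_gram, len(current))
            tokens.extend(current[:n] for n in range(min_gram, longest + 1))
            current = ""
    return tokens


for token in analyze("태권-V"):
    # Print code points, since bare conjoining jamo are hard to read.
    print(" ".join(f"U+{ord(c):04X}" for c in token))
```

Each decomposed word contributes one token per leading jamo sequence, so a query that is analyzed to the same leading jamo can match one of the indexed tokens exactly.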

Upvotes: 1
