Reputation: 1423
I've been using Elasticsearch successfully on English documents. However, I'm stuck on prefix queries with Korean words.
Specifically, a document contains a word such as "한글", and I want to retrieve that document with a prefix query using not only the search term "한" but also "ㅎ".
I could not do that using default settings.
I saw that it's related to icu_normalizer, NFD decomposition, or something similar.
But I don't fully understand what I have to do to get the result "한글" from the search term "ㅎ".
Can anyone help?
Thanks in advance.
Upvotes: 1
Views: 370
Reputation: 56
Maybe this code helps you.
curl -XPUT '127.0.0.1:9200/test' -d '{
  "settings" : {
    "analysis": {
      "tokenizer" : {
        "autocomplete_tokenizer" : {
          "type" : "edgeNGram",
          "min_gram" : "1",
          "max_gram" : "30",
          "token_chars": ["letter", "digit"]
        }
      },
      "char_filter" : {
        "nfd_normalizer" : {
          "type" : "icu_normalizer",
          "name": "nfc",
          "mode": "decompose"
        }
      },
      "analyzer": {
        "autocomplete_analyzer": {
          "type": "custom",
          "char_filter": ["nfd_normalizer"],
          "tokenizer": "autocomplete_tokenizer"
        }
      }
    }
  }
}'
curl '127.0.0.1:9200/test/_analyze?pretty=1&analyzer=autocomplete_analyzer' -d '아버지가 방에 들어가신다. 태권-V'
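To see roughly what this analyzer does to "한글" without a running cluster, here is a small Python sketch. It only imitates the two steps with the standard library (unicodedata for the decomposition, a hypothetical edge_ngrams helper for the tokenizer), so treat it as an illustration, not as what Elasticsearch actually executes. One thing it suggests you should check on your side: a standalone "ㅎ" typed on a keyboard is usually the compatibility jamo U+314E, which NFD leaves unchanged, while NFKD maps it to the leading jamo U+1112 that "한" decomposes into.

```python
import unicodedata


def edge_ngrams(text, min_gram=1, max_gram=30):
    """Hypothetical helper: leading substrings of one token, like an edge n-gram tokenizer."""
    return [text[:n] for n in range(min_gram, min(max_gram, len(text)) + 1)]


def code_points(text):
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


word = "\ud55c\uae00"  # "한글"
decomposed = unicodedata.normalize("NFD", word)  # syllables -> conjoining jamo
tokens = edge_ngrams(decomposed)

print(code_points(decomposed))
# U+1112 U+1161 U+11AB U+1100 U+1173 U+11AF

typed = "\u314e"  # "ㅎ" as a compatibility jamo (assumed to be what the user types)
query_nfd = unicodedata.normalize("NFD", typed)    # stays U+314E
query_nfkd = unicodedata.normalize("NFKD", typed)  # becomes U+1112

print(query_nfd in tokens)   # False
print(query_nfkd in tokens)  # True
```

If the same mismatch shows up in your _analyze output for "ㅎ", trying "name": "nfkc" together with "mode": "decompose" in the char filter might be worth a test, but I have not verified that against your Elasticsearch version.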
Upvotes: 1