Reputation: 448
Running Elasticsearch version 1.6
I am trying to set up a custom analyzer for my index in Elasticsearch. The index has some properties whose values contain accents and special characters.
For example, one property has a value like "name" => "Está loca". What I want is that whenever I search like this, http://localhost:9200/tutorial/helloworld/_search?q=esta
I get the result for "Está loca". I have gone through the following link and configured the analyzer it describes: https://www.elastic.co/guide/en/elasticsearch/guide/current/asciifolding-token-filter.html
curl -XPUT 'localhost:9200/tutorial?pretty' -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "helloworld": {
      "properties": {
        "name": {
          "type": "string",
          "analyzer": "standard",
          "fields": {
            "folded": {
              "type": "string",
              "analyzer": "folding"
            }
          }
        }
      }
    }
  },
  "settings": {
    "analysis": {
      "analyzer": {
        "folding": {
          "tokenizer": "standard",
          "filter": [ "lowercase", "asciifolding" ]
        }
      }
    }
  }
}'
I configured this when creating the index and added some test entries like this:
curl -X POST 'http://localhost:9200/tutorial/helloworld/1' -d '{ "name": "Está loca!" }'
curl -X POST 'http://localhost:9200/tutorial/helloworld/2' -d '{ "name": "Está locá!" }'
But when I search like this, http://localhost:9200/tutorial/helloworld/_search?q=esta, nothing is returned. I want a user who types the plain English spelling (without accents) to get the same results as with the accented spelling. Can anybody help me achieve this? I have been struggling with it for the last week.
Upvotes: 0
Views: 3021
Reputation: 448
This link also helped me a lot; it gives the exact analyzer for my scenario.
https://vanwilgenburg.wordpress.com/2013/08/03/diacritics-in-elasticsearch/
Upvotes: 0
Reputation: 4803
You will not be able to search for the keyword esta in the _all field, because Elasticsearch by default applies only the standard analyzer when constructing the _all field.
So your query
GET folding_index1/helloworld/_search?q=esta
is roughly equivalent to the following match query in the Elasticsearch query DSL:
GET folding_index1/helloworld/_search
{
"query": {
"match": {
"_all": "esta"
}
}
}
This searches against the _all field, which does not contain the folded token for name, so nothing is found.
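To see why the token does not match, here is a minimal Python sketch that only approximates the two analysis chains. It is not Elasticsearch's actual implementation: the helper names `standard_tokens` and `folded_tokens` are made up for illustration, the regex is a rough stand-in for the standard tokenizer, and stripping combining marks after NFKD decomposition is only an approximation of the asciifolding filter.

```python
import re
import unicodedata


def standard_tokens(text):
    """Rough stand-in for the standard analyzer: split on non-word chars, lowercase."""
    text = unicodedata.normalize("NFC", text)
    return [tok.lower() for tok in re.findall(r"\w+", text)]


def folded_tokens(text):
    """Rough stand-in for the 'folding' analyzer: standard tokens with accents removed."""
    folded = []
    for tok in standard_tokens(text):
        decomposed = unicodedata.normalize("NFKD", tok)
        # Drop combining marks (accents) left over after decomposition.
        folded.append("".join(ch for ch in decomposed if not unicodedata.combining(ch)))
    return folded


doc = "Está loca!"
query = "esta"

print(standard_tokens(doc))           # ['está', 'loca']
print(folded_tokens(doc))             # ['esta', 'loca']
print(query in standard_tokens(doc))  # False: what a search on _all sees
print(query in folded_tokens(doc))    # True: what a search on name.folded sees
```

The accent survives in the tokens produced by the standard chain, so the term esta only exists in the field that was indexed with the folding chain.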
You can try the following mapping, but even with include_in_all set on the multi-field, the _all field is still analyzed with the standard analyzer.
PUT folding_index1
{
  "mappings": {
    "helloworld": {
      "properties": {
        "name": {
          "type": "string",
          "analyzer": "standard",
          "fields": {
            "folded": {
              "type": "string",
              "analyzer": "folding",
              "include_in_all": true
            }
          }
        }
      }
    }
  },
  "settings": {
    "analysis": {
      "analyzer": {
        "folding": {
          "tokenizer": "standard",
          "filter": ["lowercase", "asciifolding"]
        }
      }
    }
  }
}
A query like the following, which targets the folded sub-field directly instead of _all, should work for you. See also the documentation on the _all field analyzer.
POST folding_index1/_search?q=name.folded:esta
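If you prefer a request body over the URI search, the same idea can be written as a match query on name.folded. The Python sketch below only builds and prints that body; the helper name `folded_search_body` is hypothetical, and sending it to folding_index1/_search is left to whatever HTTP client you use.

```python
import json


def folded_search_body(text, field="name.folded"):
    """Build a match query against the folded sub-field (hypothetical helper)."""
    return {"query": {"match": {field: text}}}


# Body to POST to folding_index1/_search.
body = folded_search_body("esta")
print(json.dumps(body, indent=2))
```

Because the query text is analyzed with the field's own analyzer, searching name.folded for either esta or está should produce the same folded term.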
Upvotes: 1