Reputation: 1379
I am new to the ELK 5.1.1 stack and have a few questions, mainly for my own understanding.
I have set up the stack with the standard analyzers and filters, and everything works well.
My data source is a MySQL backend that I index using Logstash.
I would like to handle queries containing accents, and I hope the asciifolding
token filter can help with this.
First, I learned how to create a custom analyzer and save it as a template.
Right now, when I query http://localhost:9200/_template?pretty
I get 2 templates: the default Logstash template, named logstash,
and my custom template, whose settings are:
"custom_template" : {
  "order" : 1,
  "template" : "doo*",
  "settings" : {
    "index" : {
      "analysis" : {
        "analyzer" : {
          "myCustomAnalyzer" : {
            "filter" : [
              "standard",
              "lowercase",
              "asciifolding"
            ],
            "tokenizer" : "standard"
          }
        }
      },
      "refresh_interval" : "5s"
    }
  },
  "mappings" : { },
  "aliases" : { }
}
Searching for the keyword Yaoundé
returns 70 hits, but when I search for Yaounde
I get no hits at all.
Below is my query for the second case
{
  "query": {
    "query_string": {
      "query": "yaounde",
      "fields": [
        "title"
      ]
    }
  },
  "from": 0,
  "size": 10
}
Can somebody please help me figure out what I am doing wrong here?
Also, given that my data is indexed through Logstash, do I really have to specify that the analyzer myCustomAnalyzer
should be applied at search time, as in this second query?
{
  "query": {
    "query_string": {
      "query": "yaounde",
      "fields": [
        "title"
      ],
      "analyzer": "myCustomAnalyzer"
    }
  },
  "from": 0,
  "size": 10
}
Here is the output section of my Logstash config file:
output {
  stdout { codec => json_lines }
  if [type] == "announces" {
    elasticsearch {
      hosts => "localhost:9200"
      document_id => "%{job_id}"
      index => "dooone"
      document_type => "%{type}"
    }
  } else {
    elasticsearch {
      hosts => "localhost:9200"
      document_id => "%{uid}"
      index => "dootwo"
      document_type => "%{type}"
    }
  }
}
Thank you
Upvotes: 0
Views: 324
Reputation: 4733
A good place to start is the Elasticsearch index template documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-templates.html
Here is an example that could work for the title field in your scenario (replace your_type with your document type):
"custom_template" : {
  "order" : 1,
  "template" : "doo*",
  "settings" : {
    "index" : {
      "analysis" : {
        "analyzer" : {
          "myCustomAnalyzer" : {
            "filter" : [
              "standard",
              "lowercase",
              "asciifolding"
            ],
            "tokenizer" : "standard"
          }
        }
      },
      "refresh_interval" : "5s"
    }
  },
  "mappings" : {
    "your_type": {
      "properties": {
        "title": {
          "type": "text",
          "analyzer": "myCustomAnalyzer"
        }
      }
    }
  },
  "aliases" : { }
}
An alternative would be to change the dynamic mapping so that every string field picks up the analyzer. The documentation has a good example for strings: https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-templates.html#match-mapping-type
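As a rough sketch of that dynamic-mapping alternative, the template body could look something like what the Python snippet below builds. The helper name build_template and the dynamic template name strings_folded are made up for illustration, and placing the rule under the _default_ mapping is my assumption for a 5.x cluster, so check it against the docs for your version.

```python
import json


def build_template(pattern="doo*", analyzer="myCustomAnalyzer"):
    """Build an index template body (hypothetical helper, not an ES API).

    Assumption: on Elasticsearch 5.x a dynamic template placed under the
    "_default_" mapping applies to every type of the matching indices.
    """
    return {
        "order": 1,
        "template": pattern,
        "settings": {
            "index": {
                "analysis": {
                    "analyzer": {
                        analyzer: {
                            "tokenizer": "standard",
                            "filter": ["standard", "lowercase", "asciifolding"],
                        }
                    }
                }
            }
        },
        "mappings": {
            "_default_": {
                "dynamic_templates": [
                    {
                        # "strings_folded" is an arbitrary, illustrative name.
                        "strings_folded": {
                            "match_mapping_type": "string",
                            "mapping": {"type": "text", "analyzer": analyzer},
                        }
                    }
                ]
            }
        },
    }


# Print the JSON body that would be sent with PUT /_template/custom_template.
print(json.dumps(build_template(), indent=2))
```

Keep in mind that a template is only applied when an index is created, so an existing index such as dooone would have to be deleted (or a new one created) and the data indexed again before the new mapping takes effect.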
Upvotes: 1
Reputation: 99
Can you show the mapping of your document?
(GET /my_index/_mapping/my_doc)
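If it helps, here is a small sketch of the two requests I would use to check this: one to read the mapping and one to ask the _analyze API which terms the analyzer produces. The host and the index name dooone are taken from the question; the helper names are made up, and the snippet only prints the requests rather than sending them.

```python
import json

BASE_URL = "http://localhost:9200"  # host taken from the question


def mapping_request(index):
    """Return (method, url) for reading the mapping of an index."""
    return ("GET", f"{BASE_URL}/{index}/_mapping")


def analyze_request(index, analyzer, text):
    """Return (method, url, body) for the _analyze API of an index."""
    body = json.dumps({"analyzer": analyzer, "text": text}, ensure_ascii=False)
    return ("GET", f"{BASE_URL}/{index}/_analyze", body)


# Print the requests so they can be pasted into curl or Kibana's console.
for request in (
    mapping_request("dooone"),
    analyze_request("dooone", "myCustomAnalyzer", "Yaoundé"),
):
    print(*request)
```

If the title field in the mapping has no analyzer entry, it is using the default one. If the _analyze call complains that the analyzer is unknown, the index was probably created before the template existed.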
The analyzer you pass as an argument in your query applies only at search time, not at index time. So if you haven't set this analyzer in your mapping, the string is still indexed with the default analyzer, and the folded query will not match it.
The analyzer you provide at search time is applied to your query string, but the resulting term is then looked up in the indexed data, which was indexed as "yaoundé" (accent kept), not "yaounde".
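To make the index-time versus search-time point concrete, here is a toy simulation in plain Python. It is not how Elasticsearch is implemented: the whitespace split stands in for the standard tokenizer, and stripping combining marks is only a rough approximation of the asciifolding filter. All function names and the sample title are invented for the example.

```python
import unicodedata
from typing import List


def ascii_fold(token: str) -> str:
    """Rough stand-in for asciifolding: decompose, then drop combining accents."""
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def analyze(text: str, fold: bool) -> List[str]:
    """Split on whitespace and lowercase; optionally fold accents as well."""
    tokens = [t.lower() for t in text.split()]
    return [ascii_fold(t) for t in tokens] if fold else tokens


def matches(indexed_text: str, query: str, index_fold: bool, search_fold: bool) -> bool:
    """True if any analyzed query term equals a term stored at index time."""
    indexed_terms = set(analyze(indexed_text, index_fold))
    return any(term in indexed_terms for term in analyze(query, search_fold))


title = "Offres à Yaoundé"  # made-up sample document title

# Folding only at search time: the index still holds "yaoundé", so no match.
print(matches(title, "yaounde", index_fold=False, search_fold=True))  # False

# Folding at index time and at search time: both sides become "yaounde".
print(matches(title, "yaounde", index_fold=True, search_fold=True))   # True
print(matches(title, "Yaoundé", index_fold=True, search_fold=True))   # True
```

Once the analyzer is set on the field in the mapping, the same analyzer is used for the query by default, so passing "analyzer" in the query_string should no longer be needed.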
Upvotes: 0