Reputation: 187
I am working out how to store my data in Elasticsearch. First I tried the fuzzy function, and while that worked okay, I did not get the expected results. Afterwards I tried the ngram and then the edge_ngram tokenizer. The edge_ngram tokenizer looked like it works like an autocomplete, which is exactly what I needed. But it still gives unexpected results. I configured min 1 and max 5 to get all results starting with the first letter I search for. While this works, I still get those results as I continue typing.
Example: I have a name field, and two documents whose names are The New York Times and The Guardian. When I search for T, both are returned as expected. But the same happens when I search for TT, TTT and so on.
In that case it does not matter whether I execute the search in Kibana or from my application (which uses MultiMatch on all fields). Kibana even shows me that it matched the single letter T.
So what did I miss and how can I achieve getting the results like with an autocomplete but without having too many results?
Upvotes: 0
Views: 53
Reputation: 16192
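When defining your index mapping, you need to specify search_analyzer as standard. If no search_analyzer is defined explicitly, Elasticsearch by default uses the field's analyzer at search time as well. With the edge_ngram analyzer applied to the query, TT is itself split into the tokens t and tt, and the t token matches every document that contains a word starting with t.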
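To make the effect of the search analyzer concrete, here is a rough Python simulation, not Elasticsearch itself. The helpers edge_ngrams, standard and matches are hypothetical names that only approximate the edge_ngram tokenizer (letters only, lowercase filter), the standard analyzer, and a match query with the default OR operator.

```python
import re

def edge_ngrams(text, min_gram=1, max_gram=5):
    """Approximate edge_ngram with token_chars=["letter"] plus a lowercase filter."""
    tokens = []
    for word in re.findall(r"[^\W\d_]+", text):  # runs of letters only
        word = word.lower()
        for n in range(min_gram, min(max_gram, len(word)) + 1):
            tokens.append(word[:n])
    return tokens

def standard(text):
    """Very rough stand-in for the standard analyzer: split into words, lowercase."""
    return [w.lower() for w in re.findall(r"\w+", text)]

def matches(doc, query, search_analyzer):
    """Assumed match semantics: a hit if ANY query token equals an indexed token."""
    indexed = set(edge_ngrams(doc))
    return any(tok in indexed for tok in search_analyzer(query))

# Same analyzer at search time: "TT" becomes ["t", "tt"], and "t" matches.
print(edge_ngrams("TT"))
print(matches("The Guardian", "TT", edge_ngrams))
# standard at search time: "TT" stays the single token "tt", which is not indexed.
print(matches("The Guardian", "TT", standard))
print(matches("The Guardian", "T", standard))
```

In this sketch the only thing that changes between the two TT lookups is how the query is tokenized; the indexed tokens stay the same.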
Here is a working example with index mapping, index data, search query and search result.
Index Mapping:
{
  "settings": {
    "analysis": {
      "analyzer": {
        "autocomplete": {
          "tokenizer": "autocomplete",
          "filter": [
            "lowercase"
          ]
        }
      },
      "tokenizer": {
        "autocomplete": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 5,
          "token_chars": [
            "letter"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "autocomplete",
        "search_analyzer": "standard" // note this
      }
    }
  }
}
Index Data:
{
  "name": "The Guardian"
}
{
  "name": "The New York Times"
}
Search Query:
{
  "query": {
    "match": {
      "name": "T"
    }
  }
}
Search Result:
"hits": [
  {
    "_index": "69027911",
    "_type": "_doc",
    "_id": "1",
    "_score": 0.23092544,
    "_source": {
      "name": "The New York Times"
    }
  },
  {
    "_index": "69027911",
    "_type": "_doc",
    "_id": "2",
    "_score": 0.20824991,
    "_source": {
      "name": "The Guardian"
    }
  }
]
Upvotes: 1