Reputation: 87
{
"settings": {
"analysis": {
"filter": {
"autocomplete_filter": {
"type": "edgeNGram",
"min_gram": 1,
"max_gram": 20
}
},
"analyzer": {
"autocomplete": {
"type": "custom",
"tokenizer": "whitespace",
"filter": [
"lowercase",
"autocomplete_filter"
]
}
}
}
},
"mappings": {
"test": {
"properties": {
"suggest": {
"type": "completion",
"analyzer": "autocomplete"
},
"hostname": {
"type": "text"
}
}
}
}
}
The mapping above is stored in Elasticsearch.
POST index/test
{
"hostname": "testing-01",
"suggest": [{"input": "testing-01"}]
}
POST index/test
{
"hostname": "testing-02",
"suggest": [{"input":"testing-02"}]
}
POST index/test
{
"hostname": "w1-testing-01",
"suggest": [{"input": "w1-testing-01"}]
}
POST index/test
{
"hostname": "w3-testing-01",
"suggest": [{"input": "w3-testing-01"}]
}
With 30 documents whose hostnames start with either w1 or w3, searching for the term "w3" returns suggestions for all the w1 hostnames first and only then the w3 ones.
Suggestion query:
{
"query": {
"_source": {
"include": [
"text"
]
},
"suggest": {
"server-suggest": {
"text": "w1",
"completion": {
"field": "suggest",
"size": 10
}
}
}
}
}
I tried different analyzers and got the same result. Can somebody guide me?
Upvotes: 1
Views: 1866
Reputation: 217254
This is a common trap. Because min_gram is 1, both w1-testing-01 and w3-testing-01 produce the token w at index time. Since you only specified analyzer, the autocomplete analyzer also kicks in at search time, so searching suggestions for w3 produces the token w as well, which is why both w1-testing-01 and w3-testing-01 match.
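To make the token overlap concrete, here is a rough Python sketch that imitates the autocomplete analyzer (whitespace split, lowercase, edge n-grams from 1 to 20 characters). It is only an approximation of what Elasticsearch does internally, and the helper names edge_ngrams and autocomplete_analyze are made up for this illustration.

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    """Return the leading substrings of token, from min_gram up to max_gram characters."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def autocomplete_analyze(text):
    """Approximate the 'autocomplete' analyzer: whitespace split, lowercase, edge n-grams."""
    tokens = []
    for word in text.split():
        tokens.extend(edge_ngrams(word.lower()))
    return tokens


# Index-time tokens for two of the hostnames from the question
w1_tokens = autocomplete_analyze("w1-testing-01")
w3_tokens = autocomplete_analyze("w3-testing-01")

# Search-time tokens when the same analyzer is applied to the query
query_tokens = autocomplete_analyze("w3")

print(w1_tokens[:3])                       # ['w', 'w1', 'w1-']
print(query_tokens)                        # ['w', 'w3']
print(set(w1_tokens) & set(query_tokens))  # {'w'}
```

The shared token w is the only thing the query for w3 has in common with w1-testing-01, and it is enough to produce a match.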
The solution is to add a search_analyzer to your suggest field so that the autocomplete analyzer is used only at indexing time and not at search time (you can use standard, keyword, or whatever analyzer makes sense for your use case):
"mappings": {
"test": {
"properties": {
"suggest": {
"type": "completion",
"analyzer": "autocomplete",
"search_analyzer": "standard" <-- add this
},
"hostname": {
"type": "text"
}
}
}
}
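The effect of the extra search_analyzer can be sketched in Python as well. This is a simplified model, not the completion suggester's real matching logic: it assumes a hostname matches when any search token equals one of its indexed edge n-grams, and that the standard analyzer turns the query w3 into the single token w3. The function names are hypothetical.

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    """Leading substrings of token, mimicking the edge n-gram filter settings."""
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]


def index_tokens(hostname):
    """Index-time tokens: whitespace split, lowercase, then edge n-grams."""
    return [g for word in hostname.lower().split() for g in edge_ngrams(word)]


def suggest(hostnames, search_tokens):
    """Simplified matching: keep hostnames sharing at least one token with the query."""
    wanted = set(search_tokens)
    return [h for h in hostnames if wanted & set(index_tokens(h))]


hostnames = ["testing-01", "testing-02", "w1-testing-01", "w3-testing-01"]

# Without search_analyzer: the query is edge n-grammed too, giving ['w', 'w3']
without_search_analyzer = suggest(hostnames, edge_ngrams("w3"))

# With search_analyzer set to standard: the query stays a single token, ['w3']
with_search_analyzer = suggest(hostnames, ["w3"])

print(without_search_analyzer)  # ['w1-testing-01', 'w3-testing-01']
print(with_search_analyzer)     # ['w3-testing-01']
```

Under these assumptions, only the hostnames that actually start with w3 survive once the query is no longer broken into edge n-grams.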
Upvotes: 3