Reputation: 1568
I am using the elasticsearch module in my Node.js app to query my index with fuzzy completion. The text I'm trying to search for is Rome–Fiumicino Leonardo da Vinci International Airport. When searching this term I get no results, but if I cut the term to 50 characters it does find it and return results.
const result = await elasticsearch.search({
  index: 'myIndex',
  body: {
    suggest: {
      fuzzinessZero: {
        text,
        completion: {
          field: 'name_suggest',
          fuzzy: {
            fuzziness: 0,
          },
          contexts,
        },
      },
      fuzzinessOne: {
        text,
        completion: {
          field: 'name_suggest',
          fuzzy: {
            fuzziness: 1,
          },
          contexts,
        },
      },
      fuzzinessTwo: {
        text,
        completion: {
          field: 'name_suggest',
          fuzzy: {
            fuzziness: 2,
          },
          contexts,
        },
      },
    },
  },
})
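For completeness, text and contexts in the snippet above are just variables defined elsewhere in the app; roughly something like this (the context name and values here are placeholders, not my real data):
const text = 'Rome–Fiumicino Leonardo da Vinci International Airport';
const contexts = {
  searchable: ['airport'], // placeholder context name/values
};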
This is the result I get in fuzzinessOne:
As you can see, the result in the text field is cut to 50 characters (maybe that's the issue). Inside the _source I get back all the inputs that are used for the search, and one of them is the full exact term I tried to search for, along with all the other available combinations.
It is worth mentioning that I'm using AWS OpenSearch. These are the settings I use to create the index:
settings: {
  analysis: {
    filter: {
      autocomplete_filter: {
        type: 'edge_ngram',
        min_gram: 2,
        max_gram: 20,
      },
      shingle_filter: {
        type: 'shingle',
        max_shingle_size: 3,
      },
    },
    analyzer: {
      autocomplete: {
        type: 'custom',
        tokenizer: 'standard',
        filter: ['lowercase', 'shingle_filter', 'asciifolding'],
      },
    },
  },
}
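For reference, a simplified sketch of how this settings object is passed when creating the index with the JavaScript client (the name_suggest completion mapping here is a trimmed-down assumption, since I haven't included the full mapping):
await elasticsearch.indices.create({
  index: 'myIndex',
  body: {
    settings, // the settings object shown above
    mappings: {
      properties: {
        // trimmed-down completion field used by the suggesters; contexts omitted
        name_suggest: { type: 'completion' },
      },
    },
  },
})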
Upvotes: 0
Views: 258
Reputation: 5486
You are facing this issue because the default value of the max_input_length parameter is 50.
Below is the description given for this parameter in the documentation:
Limits the length of a single input, defaults to 50 UTF-16 code points. This limit is only used at index time to reduce the total number of characters per input string in order to prevent massive inputs from bloating the underlying datastructure. Most use cases won’t be influenced by the default value since prefix completions seldom grow beyond prefixes longer than a handful of characters.
You can keep this default behaviour, or you can update your index mapping with an increased value for the max_input_length parameter and reindex your data.
{
  "mappings": {
    "dynamic": "false",
    "properties": {
      "namesuggest": {
        "type": "completion",
        "analyzer": "keyword_lowercase_analyzer",
        "preserve_separators": true,
        "preserve_position_increments": true,
        "max_input_length": 100,
        "contexts": [
          {
            "name": "searchable",
            "type": "CATEGORY"
          }
        ]
      }
    }
  },
  "settings": {
    "index": {
      "mapping": {
        "ignore_malformed": "true"
      },
      "refresh_interval": "5s",
      "analysis": {
        "analyzer": {
          "keyword_lowercase_analyzer": {
            "filter": [
              "lowercase"
            ],
            "type": "custom",
            "tokenizer": "keyword"
          }
        }
      },
      "number_of_replicas": "0",
      "number_of_shards": "1"
    }
  }
}
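If you go with the second option, here is a rough sketch of how the recreate-and-reindex step could look with the JavaScript client. The index names are placeholders, and I've used the name_suggest field and searchable context from your query rather than my mapping above:
// Create a new index with a larger max_input_length for the completion field.
await elasticsearch.indices.create({
  index: 'myIndex_v2', // placeholder name for the new index
  body: {
    mappings: {
      properties: {
        name_suggest: {
          type: 'completion',
          max_input_length: 100, // raised from the default of 50
          contexts: [{ name: 'searchable', type: 'category' }],
        },
      },
    },
  },
})

// Copy the existing documents into the new index so the longer inputs get indexed.
await elasticsearch.reindex({
  body: {
    source: { index: 'myIndex' },
    dest: { index: 'myIndex_v2' },
  },
})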
Upvotes: 1