Reputation: 23
My challenge here is to create an autocomplete field (Django and ES) where I could search "apeni", "rua apen" or "roa apen" and get "rua apeninos" as the main (or only) option. I have already tried suggest and completion in ES, but both are prefix-based (they don't work with "apen"). I tried wildcards as well, but couldn't combine them with fuzzy matching (they don't work with "roa apeni" or "apini"). So now I am trying match with fuzziness.
But even when the query term is different, like "rua ape" or "rua apot", it returns the same two docs, with street_desc equal to "rua apeninos" and "rua apotribu", both with score 1.0.
Query:
{
"aggs":{
"addresses":{
"filters":{
"filters":{
"street":{
"match":{
"street_desc":{
"query":"rua ape",
"fuzziness":"AUTO",
"prefix_length":0,
"max_expansions":50
}
}
}
}
},
"aggs":{
"street_bucket":{
"significant_terms":{
"field":"street_desc.raw",
"size":3
}
}
}
}
},
"sort":[
{
"_score":{
"order":"desc"
}
}
]
}
Index:
{
"catalogs":{
"mappings":{
"properties":{
"street_desc":{
"type":"text",
"fields":{
"raw":{
"type":"keyword"
}
},
"analyzer":"suggest_analyzer"
}
}
}
}
}
Analyzer: (python)
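# Assuming the elasticsearch_dsl package provides these helpers
from elasticsearch_dsl import analyzer, token_filter, tokenizer

suggest_analyzer = analyzer(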
'suggest_analyzer',
tokenizer=tokenizer("lowercase"),
filter=[token_filter('stopbr', 'stop', stopwords="_brazilian_")],
language="brazilian",
char_filter=["html_strip"]
)
Upvotes: 1
Views: 30
Reputation: 32386
Adding an end-to-end working example, which I tested with all the given search terms.
Index-mapping
{
"settings": {
"analysis": {
"filter": {
"autocomplete_filter": {
"type": "edge_ngram",
"min_gram": 1,
"max_gram": 10
}
},
"analyzer": {
"autocomplete": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"lowercase",
"autocomplete_filter"
]
}
}
}
},
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "autocomplete",
"search_analyzer": "standard"
}
}
}
}
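To see why this mapping lets a partial word such as "apen" match, here is a rough Python sketch of what the autocomplete analyzer produces at index time. It is only an approximation: it splits on whitespace instead of using the real standard tokenizer, and the function names edge_ngrams and index_terms are made up for illustration.

```python
def edge_ngrams(token, min_gram=1, max_gram=10):
    """Return the leading substrings of a token, as an edge_ngram filter would."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def index_terms(text, min_gram=1, max_gram=10):
    """Approximate the autocomplete analyzer: split, lowercase, edge n-gram.

    Assumption: whitespace splitting stands in for the standard tokenizer.
    """
    terms = []
    for token in text.lower().split():
        terms.extend(edge_ngrams(token, min_gram, max_gram))
    return terms


terms = index_terms("rua apeninos")
print(terms)
```

Because the search_analyzer is standard, the query text is not n-grammed: the whole query word ("apen", "apeni") is compared against these stored prefixes.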
Index sample docs
{
"title" : "rua apotribu"
}
{
"title" : "rua apeninos"
}
Search queries
{
"query": {
"match": {
"title": {
"query": "apeni",
"fuzziness":"AUTO"
}
}
}
}
And the search result:
"hits": [
{
"_index": "64881760",
"_type": "_doc",
"_id": "1",
"_score": 1.1026623,
"_source": {
"title": "rua apeninos"
}
}
]
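The fuzziness part is what should tolerate typos such as "roa" or "apini". With "AUTO", Elasticsearch allows 0 edits for terms of 1 to 2 characters, 1 edit for 3 to 5 characters, and 2 edits for longer terms. The sketch below mimics that rule in plain Python so you can check a query word against an indexed prefix by hand. The helper names are hypothetical, and the distance function is a simple optimal-string-alignment implementation, not Lucene's actual automaton.

```python
def auto_fuzziness(term):
    """Edits allowed by fuzziness AUTO (default thresholds 3 and 6)."""
    n = len(term)
    if n <= 2:
        return 0
    if n <= 5:
        return 1
    return 2


def edit_distance(a, b):
    """Edit distance counting insert, delete, substitute and adjacent transposition."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
            # A swap of two adjacent characters counts as one edit
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def fuzzy_matches(query_term, indexed_term):
    """True if indexed_term is within the AUTO edit budget of query_term."""
    return edit_distance(query_term, indexed_term) <= auto_fuzziness(query_term)


print(fuzzy_matches("roa", "rua"))      # typo in the first word
print(fuzzy_matches("apini", "apeni"))  # typo in the partial second word
print(fuzzy_matches("apot", "apen"))    # a genuinely different word
```

Combined with the edge n-grams stored at index time, this is why "apini" can still reach "rua apeninos": it is one edit away from the stored prefix "apeni".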
Now with "apen", it also returns a search result:
"hits": [
{
"_index": "64881760",
"_type": "_doc",
"_id": "1",
"_score": 2.517861,
"_source": {
"title": "rua apeninos"
}
}
]
And now, when the query term is different, like "rua apot", it returns both docs, with a much higher score for "rua apotribu", as shown in the search result below.
"hits": [
{
"_index": "64881760",
"_type": "_doc",
"_id": "2",
"_score": 2.9289336,
"_source": {
"title": "rua apotribu"
}
},
{
"_index": "64881760",
"_type": "_doc",
"_id": "1",
"_score": 0.41107285,
"_source": {
"title": "rua apeninos"
}
}
]
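One more thing worth checking on the Python side: in the request from the question, the match clause sits inside a filters aggregation and there is no top-level query, so the search itself behaves like match_all and every hit gets score 1.0. For the scores above to appear, the match has to go under "query". A minimal sketch of building such a body from Python follows; the function name build_autocomplete_body and the size default are my own assumptions, not part of any library.

```python
import json


def build_autocomplete_body(text, field="title", size=5):
    """Build a _search request body with the fuzzy match under "query".

    Hypothetical helper: the name, the field default and the size default
    are illustrative only.
    """
    return {
        "size": size,
        "query": {
            "match": {
                field: {
                    "query": text,
                    "fuzziness": "AUTO",
                }
            }
        },
    }


body = build_autocomplete_body("rua apot")
print(json.dumps(body, indent=2))
```

The resulting dict can be sent as the body of a _search request against the index that uses the mapping above.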
Upvotes: 1