Tech_Lover

Reputation: 87

Completion Suggester Not working as expected

{
"settings": {
    "analysis": {
        "filter": {
            "autocomplete_filter": {
                "type": "edgeNGram",
                "min_gram": 1,
                "max_gram": 20
            }
        },
        "analyzer": {
            "autocomplete": {
                "type": "custom",
                "tokenizer": "whitespace",
                "filter": [
                    "lowercase",
                    "autocomplete_filter"
                ]
            }
        }
    }
},
"mappings": {
    "test": {
        "properties": {
            "suggest": {
                "type": "completion",
                "analyzer": "autocomplete"
            },
            "hostname": {
                "type": "text"
            }
        }
    }
}

}

The mapping above is stored in Elasticsearch.

POST index/test
{ "hostname": "testing-01", "suggest": [{"input": "testing-01"}] }

POST index/test
{ "hostname": "testing-02", "suggest": [{"input": "testing-02"}] }

POST index/test
{ "hostname": "w1-testing-01", "suggest": [{"input": "w1-testing-01"}] }

POST index/test
{ "hostname": "w3-testing-01", "suggest": [{"input": "w3-testing-01"}] }

When there are 30 documents with hostnames starting with w1 and w3, and the term "w3" is searched, I get suggestions for all the w1 hostnames first and then the w3 ones.

Suggestion query:

{
"query": {
    "_source": {
        "include": [
            "text"
        ]
    },
    "suggest": {
        "server-suggest": {
            "text": "w1",
            "completion": {
                "field": "suggest",
                "size": 10
            }
        }
    }
}

}

I tried different analyzers and got the same issue. Can somebody guide me?

Upvotes: 1

Views: 1866

Answers (1)

Val

Reputation: 217254

It's a common trap. It happens because min_gram is 1, so both w1-testing-01 and w3-testing-01 produce the token w. Since you only specified analyzer, the autocomplete analyzer also kicks in at search time, so a suggestion search for w3 is itself broken down into the tokens w and w3. The shared token w is why both w1-testing-01 and w3-testing-01 match.
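To make the token overlap concrete, here is a rough Python sketch that imitates the analyzer from the question (whitespace split, lowercase, edge n-grams from 1 to 20 characters). The helper names `edge_ngrams` and `autocomplete_tokens` are made up for illustration, and this is only a simplified model of the analysis step, not of how Elasticsearch or the completion suggester works internally.

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    """Return the prefixes of `token` with lengths min_gram..max_gram."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def autocomplete_tokens(text):
    """Hypothetical stand-in for the question's analyzer:
    whitespace split, lowercase, then edge n-grams of each word."""
    tokens = []
    for word in text.split():
        tokens.extend(edge_ngrams(word.lower()))
    return tokens


# Tokens produced when the document is indexed.
indexed_w1 = set(autocomplete_tokens("w1-testing-01"))

# Tokens produced when the same analyzer runs on the search text.
query_w3 = set(autocomplete_tokens("w3"))

# The single-character prefix is common to both sides.
shared = indexed_w1 & query_w3
print(sorted(shared))  # ['w']
```

With min_gram set to 1, any two inputs that start with the same character share at least one token in this model, which matches the behaviour described in the question.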

The solution is to add a search_analyzer to your suggest field so that the autocomplete analyzer is not used at search time (you can use the standard, keyword or whatever analyzer makes sense for your use case), but only at indexing time.

  "mappings": {
    "test": {
      "properties": {
        "suggest": {
          "type": "completion",
          "analyzer": "autocomplete",
          "search_analyzer": "standard"        <-- add this
        },
        "hostname": {
          "type": "text"
        }
      }
    }
  }
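The effect of the separate search analyzer can be sketched with the same kind of toy model in Python. The assumption here is that the search-side analyzer leaves "w3" as a single lowercased token without producing n-grams. The function names are hypothetical, and the set intersection only illustrates the token overlap, not the suggester's real matching logic.

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    """Return the set of prefixes of `token` with lengths min_gram..max_gram."""
    upper = min(len(token), max_gram)
    return {token[:n] for n in range(min_gram, upper + 1)}


def index_tokens(text):
    """Index-time model: whitespace split, lowercase, edge n-grams."""
    tokens = set()
    for word in text.split():
        tokens |= edge_ngrams(word.lower())
    return tokens


def search_tokens(text):
    """Search-time model (assumption): lowercase whole words, no n-grams."""
    return {word.lower() for word in text.split()}


hostnames = ["testing-01", "testing-02", "w1-testing-01", "w3-testing-01"]

# A hostname matches in this model if it shares a token with the search text.
matches = [h for h in hostnames if search_tokens("w3") & index_tokens(h)]
print(matches)  # ['w3-testing-01']
```

Because the search text is no longer reduced to the one-character prefix w, only the hostname whose indexed prefixes include w3 matches in this sketch.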

Upvotes: 3
