Čamo

Reputation: 4160

Elasticsearch search result relevance issue

Why does a match query return less relevant results first? I have an index field named normalized. Its mapping is:

normalized: {
    type: "text",
    analyzer: "autocomplete"
}

The analysis settings for this field are:

analysis: {
    filter: {
        autocomplete_filter: {
            type: "edge_ngram",
            min_gram: "1",
            max_gram: "20"
        }
    },
    analyzer: {
        autocomplete: {
            filter: [
                "lowercase",
                "asciifolding",
                "autocomplete_filter"
            ],
            type: "custom",
            tokenizer: "standard"
        }
    }
}

As far as I know, this produces lowercased, ASCII-folded edge n-gram tokens, e.g. MOUSE = m, mo, mou, mous, mouse. The problem is that a request like:

{
    'query': {
        'bool': {
            'must': {
                'match': {
                    'normalized': 'simag'
                }
            }
        }
    }
}

returns results like

  1. "siman siman service"
  2. "mgr simona simunkova simiki"
  3. "Siman - SIMANS"
  4. "simunek simunek a simunek"
  5. .....

But none of these contains SIMAG, i.e. all the letters of the search phrase. How can I make sure that words containing the whole search term rank above words that match only part of it? I hope somebody understands what I need. Thanks.

PS: I am not sure, but what about this query:

{
    'query': {
        'bool': {
            'should': [
                {'term': {'normalized': 'simag'}},
                {'match': {'normalized': 'simag'}}
            ]
        }
    }
}

Does it make sense compared to the previous query?

Upvotes: 1

Views: 168

Answers (1)

Amit

Reputation: 32376

Please note that a match query is analyzed, which means that, by default, the analyzer used at index time for the field in your query is also applied to the search text at query time.

In your case, you applied the autocomplete analyzer to your normalized field and, as you mentioned, it generates the following tokens for MOUSE:

MOUSE = m, mo, mou, mous, mouse.

Similarly, if you search for mouse using a match query on the same field, the search term is broken into the same tokens, and the query looks for each of them:

m, mo, mou, mous, mouse. Hence documents containing words like mousee or mouser will also be returned, because the tokens created for them at index time match the tokens generated from the search term.
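The token overlap described above can be sketched in plain Python. This is only a rough simulation, not Elasticsearch itself: the helper names `edge_ngrams` and `analyze` are made up for illustration, the standard tokenizer is approximated by splitting on whitespace, and asciifolding is left out.

```python
def edge_ngrams(word, min_gram=1, max_gram=20):
    """Return the lowercased prefixes of a word, like an edge_ngram filter."""
    word = word.lower()
    return [word[:n] for n in range(min_gram, min(len(word), max_gram) + 1)]


def analyze(text):
    """Rough stand-in for the autocomplete analyzer (whitespace split only)."""
    tokens = []
    for word in text.split():
        tokens.extend(edge_ngrams(word))
    return tokens


# The same analysis is applied to the search term and to the indexed text.
query_tokens = set(analyze("simag"))
doc_tokens = set(analyze("siman siman service"))

# Tokens present on both sides are what makes the document match.
shared = sorted(query_tokens & doc_tokens, key=len)
print(shared)  # ['s', 'si', 'sim', 'sima']
```

Four of the five query tokens match "siman siman service" even though the full term simag never occurs in it, which is why such documents show up in the results.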

Read more about the match query on the Elastic site: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-query.html. The first line itself explains your search results:

match queries accept text/numerics/dates, analyzes them, and constructs a query:

If you want to dig deeper and understand how your search query matches the documents and how their scores are computed, use the explain API:

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-explain.html
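As a minimal sketch, the request bodies for inspecting the scoring could be built like this. The index name `my_index` and document id `1` in the comments are placeholders, not values from the question, and the bodies are only printed here rather than sent to a cluster.

```python
import json

# The match query from the question, reused unchanged.
query = {"match": {"normalized": "simag"}}

# Option 1: body for a normal search request; "explain": true asks
# Elasticsearch to attach a scoring breakdown to every hit.
search_body = {"explain": True, "query": query}

# Option 2: body for the explain endpoint of a single document,
# e.g. GET /my_index/_explain/1 (index name and id are placeholders).
explain_body = {"query": query}

print(json.dumps(search_body, indent=2))
print(json.dumps(explain_body, indent=2))
```

The explanation in the response lists which tokens contributed to each document's score, so it shows directly which of the short prefixes matched.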

Upvotes: 1
