Yogendra

Reputation: 101

Custom sorting in Elasticsearch

I have some documents in Elasticsearch indexed with a completion suggester. When I search for a value such as Stack, the results come back in this order:

  1. Stack Overflow
  2. Stack-Overflow
  3. Stack
  4. StackOver
  5. StackOverflow

I want the result to be displayed in the order:

  1. Stack
  2. StackOver
  3. StackOverflow
  4. Stack Overflow
  5. Stack-Overflow

i.e., exact matches should come first, ahead of results that contain a space or special characters. TIA

Upvotes: 1

Views: 1603

Answers (1)

Ashish Goel

Reputation: 919

It all depends on how you analyse the string you are querying. I suggest applying more than one analyser to the same string field. Below is an example mapping for the "name" field on which you want the autocomplete/suggester feature:

"name": {
    "type": "string",
    "analyzer": "keyword_analyzer",
    "fields": {
        "name_ac": {
            "type": "string",
            "index_analyzer": "string_autocomplete_analyzer",
            "search_analyzer": "keyword_analyzer"
        }
    }
}

Here, keyword_analyzer and string_autocomplete_analyzer are analysers defined in your index settings. Below is an example:

"keyword_analyzer": {
    "type": "custom",
    "filter": [
        "lowercase"
    ],
    "tokenizer": "keyword"
}

"string_autocomplete_analyzer": {
    "type": "custom",
    "filter": [
        "lowercase",
        "autocomplete"
    ],
    "tokenizer": "whitespace"
}

Here autocomplete is an analysis filter:

"autocomplete": {
    "type": "edgeNGram",
    "min_gram": "1",
    "max_gram": "10"
}
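
To make the effect of string_autocomplete_analyzer more concrete, here is a rough Java sketch of the terms it would index. It only imitates the whitespace tokenizer, the lowercase filter and the edge n-gram filter (min 1, max 10) from the settings above; the class and method names are made up for illustration and this is not Elasticsearch code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Elasticsearch: approximates the terms
// produced by string_autocomplete_analyzer as configured above.
class EdgeNGramSketch {

    // Prefixes of a single token, from minGram up to maxGram characters.
    static List<String> edgeNGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int limit = Math.min(maxGram, token.length());
        for (int len = minGram; len <= limit; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    // Split on whitespace, lowercase each token, then emit its edge n-grams (1..10).
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty input yields a single empty token; skip it
            }
            terms.addAll(edgeNGrams(token.toLowerCase(Locale.ROOT), 1, 10));
        }
        return terms;
    }

    public static void main(String[] args) {
        // "Stack Overflow" is two tokens, so prefixes of both words are indexed.
        System.out.println(analyze("Stack Overflow"));
        // "Stack-Overflow" stays one token; its prefixes are cut off at 10 characters.
        System.out.println(analyze("Stack-Overflow"));
    }
}
```

Because the search side uses keyword_analyzer, the query text is lowercased but not split or n-grammed, so it matches a document only if it is equal to one of these indexed prefixes.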

With this in place, when you query Elasticsearch for suggestions you can use a multiMatch query instead of a plain match query and give each field its own boost. Below is an example in Java:

QueryBuilders.multiMatchQuery(yourSearchString, "name^3", "name.name_ac");

You may need to alter the boost (^3) as per your needs.

If this still does not meet your requirements, you can add one more analyser that treats the whole string as a single token, so that matching is anchored at its first word, and include that field in the multiMatch as well. Below is an example of such an analyser:

"first_word_name_analyzer": {
    "type": "custom",
    "filter": [
        "lowercase",
        "whitespace_merge",
        "edgengram"
    ],
    "tokenizer": "keyword"
}

With these analysis filters:

"whitespace_merge": {
    "pattern": "\\s+",
    "type": "pattern_replace",
    "replacement": " "
},
"edgengram": {
    "type": "edgeNGram",
    "min_gram": "1",
    "max_gram": "32"
}
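
As a rough illustration of what first_word_name_analyzer indexes, the Java sketch below imitates its chain: keyword tokenizer (the whole value is one token), lowercase, whitespace_merge, then edge n-grams from 1 to 32 characters. The class name is hypothetical and the code only approximates the analysis; it is not how Elasticsearch implements it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Elasticsearch: approximates the terms
// produced by first_word_name_analyzer as configured above.
class FirstWordAnalyzerSketch {

    static List<String> analyze(String text) {
        // Keyword tokenizer: the whole value is a single token.
        // lowercase + whitespace_merge: collapse runs of whitespace into one space.
        String token = text.toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");

        // edgengram: prefixes of that single token, 1 to 32 characters long.
        List<String> terms = new ArrayList<>();
        int limit = Math.min(32, token.length());
        for (int len = 1; len <= limit; len++) {
            terms.add(token.substring(0, len));
        }
        return terms;
    }

    public static void main(String[] args) {
        // Every term starts at the beginning of the value, so "overflow"
        // on its own is never indexed by this analyser.
        System.out.println(analyze("Stack   Overflow"));
    }
}
```

Since every indexed term starts at the first character of the value, a field analysed this way only matches queries that are a prefix of the whole name, which is why it can be boosted to favour names that begin with the search text.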

You may need to experiment with the boost values to get the best results for your requirements. Hope this helps.

Upvotes: 1
