ywbike

Reputation: 1

Elasticsearch - How to highlight stop words only for exact phrase match?

Is it possible to highlight stop words which occur in a phrase, but not to highlight stop words which occur alone?

For example, I want to highlight "the lord of the rings", "rings", or "lord". But I don't want Elasticsearch to highlight "of" or "the" if they occur alone.

I am using the english_stop filter in the analyzer defined in my index settings. It removes all stop words, so no stop words are highlighted in the search results. But if I remove the english_stop filter, it always highlights stop words like "of" and "the", even when they occur alone. I can't add another field that uses the english analyzer, because I have a lot of documents and reindexing is too costly.
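
To illustrate why nothing is left to highlight, here is a toy model of a stop-word filter. It is only a sketch, not Elasticsearch's implementation: the `STOP_WORDS` set is a small hypothetical subset, the `analyze` helper is made up for this example, and stemming (kstem) is left out.

```python
# Toy model of a stop-word token filter. NOT Elasticsearch's implementation.
# STOP_WORDS is a small illustrative subset chosen for this example.
STOP_WORDS = {"a", "an", "and", "in", "of", "on", "the"}


def analyze(text):
    """Lowercase, split on whitespace, drop stop words, keep original positions."""
    tokens = []
    for position, word in enumerate(text.lower().split()):
        if word not in STOP_WORDS:
            tokens.append((word, position))
    return tokens


# Only "lord" and "rings" survive, so only they can be matched and highlighted.
print(analyze("The Lord of the Rings"))  # [('lord', 1), ('rings', 4)]
```

In this model "the" and "of" never become tokens, which matches what I see: they cannot be highlighted, whether alone or inside the phrase.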

Is there a way to highlight stop words only in a phrase match, without having to change the index schema?

My index template:

  "template": "index_name",
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "english": {
            "tokenizer": "tokenizer_name",
            "filter": [
              "standard",
              "lowercase",
              "english_stop",
              "kstem"
            ]
          },

This is the highlighted result for the search query "The Lord of the Rings":

The Lord of the Rings is an epic high-fantasy novel written by English author J. R. R. Tolkien. The story began as a sequel to Tolkien's 1937 fantasy novel The Hobbit, but eventually developed into a much larger work. Written in stages between 1937 and 1949, The Lord of the Rings is one of the best-selling novels

Upvotes: 0

Views: 1180

Answers (1)

Hardik Dobariya

Reputation: 349

If you are using a query_string query, use the phrase_slop property; with it, only "rings" and "lord" are highlighted. We had the same issue and this solved it, but it has a limitation: ES cannot differentiate between "Man in the moon" and "Man on the moon". It will highlight the "man" and "moon" occurrences in both, because "in" and "on" are both stop words.

https://www.elastic.co/guide/en/elasticsearch/guide/master/stopwords-phrases.html#_stopwords
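
A rough sketch of what such a request body could look like, built as a plain Python dict. The field name `content`, the helper `build_search_body`, and the slop value of 2 are assumptions made for illustration; adjust them to your mapping and test the slop value against your own data.

```python
import json


def build_search_body(text, field="content", slop=0):
    """Build a query_string phrase search with highlighting on one field.

    `field` and `slop` are placeholders for illustration. The text is assumed
    to contain no double quotes, since it is wrapped in quotes as a phrase.
    """
    return {
        "query": {
            "query_string": {
                "query": f'"{text}"',    # quoted, so it is parsed as a phrase
                "default_field": field,
                "phrase_slop": slop,     # allowed positional distance in the phrase
            }
        },
        "highlight": {"fields": {field: {}}},
    }


body = build_search_body("The Lord of the Rings", slop=2)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a `_search` request against the index.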

Upvotes: 0
