Stefan

Reputation: 53

Position as result, instead of highlighting

I am trying to get positions, instead of highlighted text, as the result of an Elasticsearch query.

Create the index:

PUT /test/
{
  "mappings": {
    "article": {
      "properties": {
        "text": {
          "type": "text",
          "analyzer": "english"
        },
        "author": {
          "type": "text"
        }
      }
    }
  }
}

Put a document:

PUT /test/article/1
{
  "author": "Just Me",
  "text": "This is just a simple test to demonstrate the audience the purpose of the question!"
}

Search the document:

GET /test/article/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match_phrase": {
            "text": {
              "query": "simple test",
              "_name": "must"
            }
          }
        }
      ],
      "should": [
        {
          "match_phrase": {
            "text": {
              "query": "need help",
              "_name": "first",
              "slop": 2
            }
          }
        },
        {
          "match_phrase": {
            "text": {
              "query": "purpose question",
              "_name": "second",
              "slop": 3
            }
          }
        },
        {
          "match_phrase": {
            "text": {
              "query": "don't know anything",
              "_name": "third"
            }
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  "highlight": {
    "fields": {
      "text": {}
    }
  }
}

When I run this search, I get a result like this:

This is just a simple test to <em>demonstrate</em> the audience the purpose of the <em>question</em>!

I'm not interested in getting the results wrapped in em tags. Instead, I want to get the positions of all the matches, like so:

"hits": [
   { "start_offset": 30, "end_offset": 40 },
   { "start_offset": 74, "end_offset": 81 }
]

Hope you get my idea!

Upvotes: 2

Views: 1217

Answers (1)

Lupanoide

Reputation: 3222

To get the offset positions of a word in a text, add a term vector to the field in your index mapping (see the term vector documentation). As the documentation states, this parameter has to be enabled at index time:

"term_vector": "with_positions_offsets_payloads"
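
Applied to the mapping from the question, that setting would sit on the text field. This is only a sketch: it assumes an Elasticsearch version that still uses mapping types (as the "article" type in the question suggests), and because term_vector generally cannot be changed on an existing field, the index would probably have to be recreated and the documents reindexed.

```
PUT /test
{
  "mappings": {
    "article": {
      "properties": {
        "text": {
          "type": "text",
          "analyzer": "english",
          "term_vector": "with_positions_offsets_payloads"
        },
        "author": {
          "type": "text"
        }
      }
    }
  }
}
```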

For the specific query, please follow the linked documentation page.
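
As a rough sketch of such a query, the term vectors API can return positions and offsets for a single document. The request below assumes the index, type, and document ID from the question, and the typed endpoint form used by older Elasticsearch versions; the path differs in versions without mapping types.

```
GET /test/article/1/_termvectors
{
  "fields": ["text"],
  "positions": true,
  "offsets": true,
  "payloads": false,
  "term_statistics": false,
  "field_statistics": false
}
```

Each term in the response should then carry a tokens list with position, start_offset, and end_offset. Note that these are offsets for every term in the field, keyed by the analyzed form of the term, so they still have to be matched against the query terms on the client side.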

Upvotes: 1
