I am trying to get the positions (character offsets) of the matches, instead of highlighted text, as the result of an Elasticsearch query.
Create the index:
PUT /test/
{
  "mappings": {
    "article": {
      "properties": {
        "text": {
          "type": "text",
          "analyzer": "english"
        },
        "author": {
          "type": "text"
        }
      }
    }
  }
}
Put a document:
PUT /test/article/1
{
  "author": "Just Me",
  "text": "This is just a simple test to demonstrate the audience the purpose of the question!"
}
Search the document:
GET /test/article/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match_phrase": {
            "text": {
              "query": "simple test",
              "_name": "must"
            }
          }
        }
      ],
      "should": [
        {
          "match_phrase": {
            "text": {
              "query": "need help",
              "_name": "first",
              "slop": 2
            }
          }
        },
        {
          "match_phrase": {
            "text": {
              "query": "purpose question",
              "_name": "second",
              "slop": 3
            }
          }
        },
        {
          "match_phrase": {
            "text": {
              "query": "don't know anything",
              "_name": "third"
            }
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  "highlight": {
    "fields": {
      "text": {}
    }
  }
}
When I run this search, I get a result like this:
This is just a simple test to <em>demonstrate</em> the audience the purpose of the <em>question</em>!
I'm not interested in getting the results wrapped in em tags; instead, I want to get the positions of all the matches, like so:
"hits": [
  { "start_offset": 30, "end_offset": 40 },
  { "start_offset": 74, "end_offset": 81 }
]
Hope you get my idea!
Reputation: 3222
To get the offset positions of the words in a text, you should add a term vector to the field in your index mapping
- doc here. As the documentation says, this parameter has to be enabled at index time:
"term_vector": "with_positions_offsets_payloads"
For the specific query, please follow the linked doc page.
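As a rough sketch of what that could look like for the index in the question: the mapping below is the question's own mapping with the `term_vector` setting added to the `text` field. It assumes the same Elasticsearch version as the question (one that still uses mapping types such as `article`), and that the index is created fresh and the document indexed again afterwards, since this setting generally cannot be changed on an existing field.

```
PUT /test
{
  "mappings": {
    "article": {
      "properties": {
        "text": {
          "type": "text",
          "analyzer": "english",
          "term_vector": "with_positions_offsets_payloads"
        },
        "author": {
          "type": "text"
        }
      }
    }
  }
}
```

Once the document is indexed again, the offsets can be requested through the term vectors API. The request below is an assumption about how to call it for document `1` with the typed endpoint that matches the question's version; newer versions use a different URL shape.

```
GET /test/article/1/_termvectors
{
  "fields": ["text"],
  "positions": true,
  "offsets": true,
  "payloads": false,
  "term_statistics": false,
  "field_statistics": false
}
```

The response lists every analyzed term of the `text` field together with its tokens, each carrying `position`, `start_offset` and `end_offset`. Note that it covers all terms in the field, not only those matched by the query, and that with the `english` analyzer the terms appear in stemmed form, so the terms of interest still have to be picked out on the client side.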