Reizar

Reputation: 241

ElasticSearch: Labelling documents with matching search term

I'm using Elasticsearch 1.7 and need a way to label each returned document with the part of a query_string query that it matched.

I've been experimenting with highlighting, but it gets messy in some cases. I'd rather have each document tagged with the search terms it matched.

Here is the query I'm using (note that this is a Ruby hash that later gets encoded to JSON):

{
  query: {
    query_string: {
      fields: ["title^10", "keywords^4", "content"],
      query: query_string,
      use_dis_max: false
    }
  },
  size: 20,
  from: 0,
  sort: [
    { pub_date: { order: :desc }},
    { _score:   { order: :desc }}
  ]
}

The query_string variable is built from the topics a user follows and might look something like this: "(the AND walking AND dead) OR (iphone) OR (video AND games)"

Is there any option I can use so that each returned document has a property identifying the search term it matched, such as the walking dead or (the AND walking AND dead)?

Upvotes: 3

Views: 983

Answers (1)

Val

Reputation: 217274

If you're willing to switch to a bool/should query, you can split the match into one query per field and use named queries. The results will then include the name of each query that matched a given document.

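It goes basically like this: inside a bool/should query, you add one query_string query per field and give each one a name that identifies its field (e.g. title_query for the title field, and so on). In the example, the "query" value "query_string" stands in for your actual query text: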

{
  "query": {
    "bool": {
      "should": [
        {
          "query_string": {
            "fields": [
              "title^10"
            ],
            "query": "query_string",
            "use_dis_max": false,
            "_name": "title_query"
          }
        },
        {
          "query_string": {
            "fields": [
              "keywords^4"
            ],
            "query": "query_string",
            "use_dis_max": false,
            "_name": "keywords_query"
          }
        },
        {
          "query_string": {
            "fields": [
              "content"
            ],
            "query": "query_string",
            "use_dis_max": false,
            "_name": "content_query"
          }
        }
      ]
    }
  }
}
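
Since the question builds its request as a Ruby hash, the same body could be generated along these lines. This is only a sketch: the FIELDS constant and the named_field_search helper are hypothetical names introduced here, and the size, from and sort values are simply carried over from the question.

```ruby
require "json"

# Hypothetical mapping (not from the answer): boosted field spec => name
# that should show up in matched_queries when that clause matches.
FIELDS = {
  "title^10"   => "title_query",
  "keywords^4" => "keywords_query",
  "content"    => "content_query"
}.freeze

# Hypothetical helper: wraps one named query_string clause per field
# in a bool/should query, keeping the paging and sorting from the question.
def named_field_search(query_string, size: 20, from: 0)
  clauses = FIELDS.map do |field, name|
    {
      query_string: {
        fields: [field],
        query: query_string,
        use_dis_max: false,
        _name: name
      }
    }
  end

  {
    query: { bool: { should: clauses } },
    size: size,
    from: from,
    sort: [
      { pub_date: { order: :desc } },
      { _score:   { order: :desc } }
    ]
  }
end

body = named_field_search("(the AND walking AND dead) OR (iphone) OR (video AND games)")
puts JSON.pretty_generate(body)
```

The same shape could presumably be used with one clause per followed topic instead of one per field, naming each clause after its topic, but that variation is an assumption and not something the answer covers.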

In the results, each hit then carries, alongside _source, an array called matched_queries that contains the names of the queries the document matched:

"_source": {
    ...
},
"matched_queries": [
    "title_query"
],
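
On the client side, the labels could then be read off each hit roughly as follows. This is a sketch over a hand-written response excerpt, not real output: the document IDs and titles are made up, and labels_by_id is a hypothetical helper name.

```ruby
require "json"

# Hand-written, hypothetical response excerpt; a real response has more fields.
response = JSON.parse(<<~RESPONSE)
  {
    "hits": {
      "hits": [
        {
          "_id": "1",
          "_source": { "title": "iPhone review" },
          "matched_queries": ["title_query", "content_query"]
        },
        {
          "_id": "2",
          "_source": { "title": "Weekly roundup" },
          "matched_queries": ["content_query"]
        }
      ]
    }
  }
RESPONSE

# Hypothetical helper: maps each document ID to the names of the queries
# it matched, falling back to an empty list when the key is absent.
def labels_by_id(response)
  response.fetch("hits").fetch("hits").each_with_object({}) do |hit, labels|
    labels[hit["_id"]] = hit.fetch("matched_queries", [])
  end
end

labels_by_id(response).each do |id, names|
  puts "#{id}: #{names.join(', ')}"
end
```

Falling back to an empty array keeps the caller from having to special-case hits that carry no matched_queries key.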

Upvotes: 7
