ttarik

Reputation: 3853

Elasticsearch boost documents where fulltext query substring exactly matches field

I'm using Elasticsearch 8.6 and I'm trying to boost documents in a full-text search where the value of a field is exactly contained within the query text. In other words, where a field is an exact substring of the query.

For example, say I have these documents:

{
  "id": 1,
  "title": "Kids t-shirts",
  "category": "Kids clothes"
},
{
  "id": 2,
  "title": "All about kids",
  "category": ""
}

Let's say I search for kids based on the title field, and the docs are returned in the order #2, #1. That's good so far!

Here's where I'm stuck:

If the query contains the exact name of a category (for example, blue kids clothes), I'd like to boost the document whose category matches the query - in this case, #1.

I've tried boosting with an extra match or match_phrase query on the category field but it works the wrong way around. For example, now searching for kids will also trigger the boost for document #1 because the match score between kids and the category kids clothes is positive.

An exact match doesn't work for the same reason - it's the wrong way around. I want blue kids clothes to trigger the boost based on the category kids clothes being an exact substring of the query, but an exact match would only boost if the query was exactly kids clothes.

To clarify, here are some cases where I want the boost to trigger:

Query text        | Should boost doc #1?
kids clothes      | Yes (exact category matches query)
blue kids clothes | Yes (exact category is contained within query)
kids              | No (no exact match)
clothes for kids  | No (both words from category are in query but wrong order = no exact match)

(Doc #2 should never be boosted because its category field is empty)

Any advice or direction would be really appreciated!

Upvotes: 0

Views: 103

Answers (1)

ttarik

Reputation: 3853

One way I was able to do this was with a custom script_score:

{
  "script_score": {
    "query": {
      "match": {
        "category": "blue kids clothes"
      }
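    },
    "script": {
      "source": "def c = doc['category.keyword']; if (c.size() == 0 || c.value.isEmpty()) { return 0.0; } return params.query.toLowerCase().contains(c.value.toLowerCase()) ? 1.0 : 0.0;",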
      "params": {
        "query": "blue kids clothes"
      }
    }
  }
}
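On its own, a script_score query replaces the score rather than adding to it, so to use it as a boost it probably needs to sit in a should clause next to the main title query. Below is a rough Python sketch of that idea, not a tested production query. The function names, the boost value of 5.0 and the lowercasing are my own assumptions; category.keyword is assumed to be a plain keyword subfield. The small category_in_query helper only mirrors the script's check locally so the cases from the question can be verified without a cluster.

```python
def category_in_query(query_text: str, category: str) -> bool:
    """Local mirror of the script check: is the whole category a substring of the query?"""
    category = category.strip().lower()
    # An empty category must never count, because "" is a substring of every string.
    return bool(category) and category in query_text.lower()


def build_boosted_query(query_text: str, boost: float = 5.0) -> dict:
    """Build a bool query: full-text match on title plus an optional category boost.

    The boost value is a hypothetical default; tune it against real scores.
    """
    # Painless script: return the boost only when the stored category is
    # non-empty and appears verbatim (case-insensitively) inside the query text.
    script = (
        "def c = doc['category.keyword'];"
        " if (c.size() == 0 || c.value.isEmpty()) { return 0.0; }"
        " return params.query.toLowerCase().contains(c.value.toLowerCase())"
        " ? params.boost : 0.0;"
    )
    return {
        "bool": {
            # The main full-text query decides which documents match.
            "must": [{"match": {"title": query_text}}],
            # The should clause only adds to the score of documents that also
            # match it, so a script result of 0.0 leaves the ranking unchanged.
            "should": [
                {
                    "script_score": {
                        "query": {"match": {"category": query_text}},
                        "script": {
                            "source": script,
                            "params": {"query": query_text, "boost": boost},
                        },
                    }
                }
            ],
        }
    }
```

With the official Python client this could be sent as es.search(index="products", query=build_boosted_query("blue kids clothes")), where the index name is made up. One caveat: this is a plain substring check, so a query like "skids clothes" would also trigger the boost. If that matters, the check would need to compare on word boundaries instead.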

Happy to hear any other answers!

Upvotes: 0
