n1ru4l

Reputation: 498

Boost score of term query result with multiple matches

I have several documents stored in my Elasticsearch index, created as follows:

PUT tests
{
  "mappings": {
    "_doc": {
      "dynamic": false,
      "properties": {
        "objects": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword"
            }
          }
        },
        "text": {
          "type": "text"
        }
      }
    }
  }
}

PUT tests/_doc/1
{
  "text": "lel",
  "objects": ["A"]
}

PUT tests/_doc/2
{
  "text": "lol",
  "objects": ["B"]
}

PUT tests/_doc/3
{
  "text": "lil",
  "objects": ["C"]
}

PUT tests/_doc/4
{
  "text": "lul",
  "objects": ["A", "B", "C"]
}

I search these documents with the following terms query:

GET _search
{
  "query": {
    "terms": {
      "objects.keyword": ["A", "B", "C"]
    }
  }
}

The result includes all four sample documents I provided, each with the same score.

My question: can I boost a document that is a full match (all queried keywords appear in its objects array) so that it ranks above documents that match only partially? If so, how? I could not find anything about this in the Elasticsearch documentation.

This is the result I am currently receiving:

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 11,
    "successful": 11,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 1,
    "hits": [
      {
        "_index": "tests",
        "_type": "_doc",
        "_id": "2",
        "_score": 1,
        "_source": {
          "text": "lol",
          "objects": [
            "B"
          ]
        }
      },
      {
        "_index": "tests",
        "_type": "_doc",
        "_id": "4",
        "_score": 1,
        "_source": {
          "text": "lul",
          "objects": [
            "A",
            "B",
            "C"
          ]
        }
      },
      {
        "_index": "tests",
        "_type": "_doc",
        "_id": "1",
        "_score": 1,
        "_source": {
          "text": "lel",
          "objects": [
            "A"
          ]
        }
      },
      {
        "_index": "tests",
        "_type": "_doc",
        "_id": "3",
        "_score": 1,
        "_source": {
          "text": "lil",
          "objects": [
            "C"
          ]
        }
      }
    ]
  }
}

Upvotes: 4

Views: 2927

Answers (1)

wholevinski

Reputation: 3828

I think your best bet is a bool query with one term clause per value under should, plus minimum_should_match: 1 so that a document still has to match at least one of them.

GET _search
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "objects.keyword": "A" 
          }
        },
        {
          "term": {
            "objects.keyword": "B" 
          }
        },
        {
          "term": {
            "objects.keyword": "C" 
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
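If the list of values is not fixed, the query above can be generated rather than written by hand. Here is a minimal Python sketch that builds the same request body as a plain dict; `build_should_query` is a hypothetical helper name of mine, not part of any Elasticsearch client, and the sketch only constructs the JSON without sending it anywhere.

```python
import json


def build_should_query(field, values, minimum_should_match=1):
    """Build a bool query body with one `term` clause per value.

    Hypothetical helper: it only assembles the request body as a dict.
    """
    return {
        "query": {
            "bool": {
                # One should clause per value, so each match can add to the score.
                "should": [{"term": {field: value}} for value in values],
                "minimum_should_match": minimum_should_match,
            }
        }
    }


query = build_should_query("objects.keyword", ["A", "B", "C"])
print(json.dumps(query, indent=2))
```

The printed JSON can be pasted after `GET _search` or passed as the request body by whatever client you use.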

Results:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 6,
    "successful": 6,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 1.5686159,
    "hits": [
      {
        "_index": "tests",
        "_type": "_doc",
        "_id": "4",
        "_score": 1.5686159,
        "_source": {
          "text": "lul",
          "objects": [
            "A",
            "B",
            "C"
          ]
        }
      },
      {
        "_index": "tests",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "text": "lel",
          "objects": [
            "A"
          ]
        }
      },
      {
        "_index": "tests",
        "_type": "_doc",
        "_id": "3",
        "_score": 0.2876821,
        "_source": {
          "text": "lil",
          "objects": [
            "C"
          ]
        }
      },
      {
        "_index": "tests",
        "_type": "_doc",
        "_id": "2",
        "_score": 0.18232156,
        "_source": {
          "text": "lol",
          "objects": [
            "B"
          ]
        }
      }
    ]
  }
}

EDIT: Here's why, as explained by the docs (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html):

The bool query takes a more-matches-is-better approach, so the score from each matching must or should clause will be added together to provide the final _score for each document.
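To make the quoted sentence concrete, here is a toy Python model of the additive behaviour. It is a simplification under a loud assumption: every matching clause contributes the same fixed score (`clause_score`, a made-up parameter), whereas Elasticsearch computes a relevance score per clause that depends on term and shard statistics. That is why the real scores above are not simple multiples of each other. The names `toy_bool_score` and `docs` are illustrative only.

```python
def toy_bool_score(doc_values, query_terms, clause_score=1.0):
    """Add a fixed score for every should clause that matches the document.

    Assumption: a constant per-clause score stands in for the real
    per-clause relevance score computed by Elasticsearch.
    """
    return sum(clause_score for term in query_terms if term in doc_values)


# The four sample documents from the question, keyed by _id.
docs = {
    "1": ["A"],
    "2": ["B"],
    "3": ["C"],
    "4": ["A", "B", "C"],
}
query_terms = ["A", "B", "C"]

scores = {doc_id: toy_bool_score(values, query_terms) for doc_id, values in docs.items()}
# Highest score first; the full match collects one contribution per clause.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked[0], scores[ranked[0]])
```

In this model document 4 matches all three clauses and therefore ends up first, which mirrors the ordering in the real results.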

Upvotes: 5
