Alexxandar

Reputation: 994

Elasticsearch sort results from several indexes so that one index has priority

I have 6 websites; let's call them A, B, C, D, E and M. M is the master website, because from it you can search the contents of the others. I did this easily by putting all the indexes, separated by commas, in the search query.

However, I now have a new requirement: from every website you should be able to search all websites (easy to do, just apply the solution from M to all of them), BUT results from the current website should get priority.

So if I'm searching from C, the first results should be from C, followed by results from the other websites ordered by score.

Now, how do I give results from one index priority over the rest?

Upvotes: 1

Views: 884

Answers (1)

rusnyder

Reputation: 831

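A boosting query serves this purpose well. It returns every document that matches the positive clause and multiplies the score of any document that also matches the negative clause by negative_boost.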

Sample data

POST /_bulk
{"index":{"_index":"a"}}
{"message":"First website"}
{"index":{"_index":"b"}}
{"message":"Second website"}
{"index":{"_index":"c"}}
{"message":"Third website"}
{"index":{"_index":"d"}}
{"message":"Something irrelevant"}

Query

POST /a,b,c,d/_search
{
  "query": {
    "boosting": {
      "positive": {
        "match": {
          "message": "website"
        }
      },
      "negative": {
        "terms": {
          "_index": ["b", "c", "d"]
        }
      },
      "negative_boost": 0.2
    }
  }
}
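
The query above hard-codes the case where A is the current site. Since every site needs the same query with a different negative list, the request body can be generated. Below is a minimal sketch in Python; the function name build_priority_query and its parameters are hypothetical, and the field name message simply follows the sample data. It only builds the JSON body and leaves sending it to whichever client you already use.

```python
import json


def build_priority_query(current_index, all_indexes, text,
                         field="message", negative_boost=0.2):
    """Build a boosting query that demotes hits from every index except current_index.

    All names here are illustrative; only the query structure mirrors the example above.
    """
    # Every index other than the current site's goes into the negative clause.
    others = [name for name in all_indexes if name != current_index]
    return {
        "query": {
            "boosting": {
                "positive": {"match": {field: text}},
                "negative": {"terms": {"_index": others}},
                "negative_boost": negative_boost,
            }
        }
    }


# Searching from site C: hits from a, b and d are demoted.
body = build_priority_query("c", ["a", "b", "c", "d"], "website")
print(json.dumps(body, indent=2))
```

The resulting body would be posted to /a,b,c,d/_search exactly like the hand-written query, so each site keeps searching all indexes and only the negative list changes.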

Response

{
  ...
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "a",
        "_type" : "_doc",
        "_id" : "sx-DkWsBHWmGEbsYwViS",
        "_score" : 0.2876821,
        "_source" : {
          "message" : "First website"
        }
      },
      {
        "_index" : "b",
        "_type" : "_doc",
        "_id" : "tB-DkWsBHWmGEbsYwViS",
        "_score" : 0.05753642,
        "_source" : {
          "message" : "Second website"
        }
      },
      {
        "_index" : "c",
        "_type" : "_doc",
        "_id" : "tR-DkWsBHWmGEbsYwViS",
        "_score" : 0.05753642,
        "_source" : {
          "message" : "Third website"
        }
      }
    ]
  }
}

Notes

  1. The smaller you make negative_boost, the more likely it is that results from the current site's index will outrank results from the other indexes.
  2. If you set negative_boost to 0, results from the current site are guaranteed to sort first, but every hit from the other sites ends up with a score of 0, so the order among those remaining hits will be arbitrary.

I reckon something like negative_boost: 0.1, which is an order-of-magnitude adjustment on relevance, should get you what you're looking for.
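
To get a feel for how the choice of negative_boost plays out, here is a small Python sketch that mimics the rescoring outside Elasticsearch. The raw scores are made up for illustration and the rank helper is hypothetical; it only reproduces the multiplication visible in the response above (0.2876821 × 0.2 = 0.05753642), not Elasticsearch's actual scoring.

```python
def rank(hits, current_index, negative_boost):
    """Scale the score of hits outside current_index by negative_boost, then sort descending."""
    rescored = [
        (index, score if index == current_index else score * negative_boost)
        for index, score in hits
    ]
    return sorted(rescored, key=lambda hit: hit[1], reverse=True)


# Hypothetical raw relevance scores: the hit from "b" is six times as relevant as the one from "a".
hits = [("a", 1.0), ("b", 6.0), ("c", 3.0)]

# With 0.2, a hit that is more than 5x as relevant still beats the current site.
print([index for index, _ in rank(hits, "a", 0.2)])

# With 0.1, the other site would need to be more than 10x as relevant.
print([index for index, _ in rank(hits, "a", 0.1)])
```

In other words, a negative_boost of x means a hit from another site has to score more than 1/x times higher than a hit from the current site before it can outrank it.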

Upvotes: 2
