Scott Stafford

Reputation: 44776

Can Elasticsearch do a decay search on the log of a value?

I store a number, views, in Elasticsearch. I want to find documents "closest" to it on a logarithmic scale, so that 10k and 1MM are the same distance (and get scored the same) from 100k views. Is that possible?

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html#exp-decay describes the field value factor and decay functions, but can they be "stacked"? Or is there another approach?

Upvotes: 1

Views: 364

Answers (1)

Dusty

Reputation: 3971

I'm not sure if you can achieve this directly with decay, but you could easily do it with the script_score function. The example below uses dynamic scripting, but please be aware that using file-based scripts is the recommended, far more secure approach.

In the query below, the offset parameter is set to 100,000, and documents with that value in their 'views' field score the highest. The score falls off as the logarithm of views moves away from the logarithm of offset, in either direction. Per your example, documents with 1,000,000 or 10,000 views get identical scores (0.30279312 with this formula, when the incoming _score is 1).

You can invert the order of these results, so that documents farthest from offset score highest, by changing the script to multiply _score by the (1 + ...) term instead of dividing by it.

$ curl -XPOST localhost:9200/somestuff/_search -d '{
  "size": 100,
  "query": {
    "bool": {
      "must": [
        {
          "function_score": {
            "functions": [
              {
                "script_score": {
                  "params": {
                    "offset": 100000
                  },
                  "script": "_score / (1 + ((log(offset) - log(doc[\"views\"].value)).abs()))"
                }
              }
            ]
          }
        }
      ]
    }
  }
}'

Note: depending on your data, you may want to account for the possibility of 'views' being null (or 0, where the logarithm is undefined).
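If you want to sanity-check the formula outside Elasticsearch, here is a minimal Python sketch of the same arithmetic. It is not Elasticsearch code: the function name log_distance_score and the base_score argument (standing in for _score) are made up for illustration, and returning 0.0 for a missing or non-positive views value is just one possible policy, not something the query above does.

```python
import math


def log_distance_score(views, offset, base_score=1.0):
    """Same arithmetic as the script: base_score / (1 + |ln(offset) - ln(views)|)."""
    # Assumption: treat a missing or non-positive value as "no match" (score 0.0),
    # since the logarithm is undefined there. Choose whatever policy fits your data.
    if views is None or views <= 0:
        return 0.0
    return base_score / (1 + abs(math.log(offset) - math.log(views)))


if __name__ == "__main__":
    # 10,000 and 1,000,000 are both a factor of 10 away from 100,000,
    # so they end up the same log-distance from the offset.
    for views in (10_000, 100_000, 1_000_000, None):
        print(views, log_distance_score(views, offset=100_000))
```

A value equal to offset gives a log-distance of 0 and therefore the full base score, while values a factor of 10 above or below land at roughly 0.3028, in line with the score quoted above.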

Upvotes: 2
