brupm
brupm

Reputation: 1183

Elasticsearch sort based on the number of occurrences a string appears in an array

I have an array field containig a list of strings: ie.: ["NY", "CA"]

At search time I have a filter which matches any of the strings in the array.

I would like to sort the results based on documents that have the most number of appearances of the searched string: "NY"

Results should include: document 1: ["CA", "NY", "NY"] document 2: ["NY", FL"] document 3: ["NY", CA", "NY", "NY"]

Results should be ordered as such

User 3, User 1, User 2

Is this possible? If so, how?

Upvotes: 0

Views: 2606

Answers (2)

brupm
brupm

Reputation: 1183

For those curious, I was not able to boost based on how many occurrences of the word happen in the array. I did however accomplished what I needed with the following:

curl -X POST "http://localhost:9200/index/document/1" -d '{"id":1,"states_ties":["CA"],"state_abbreviation":"CA","worked_in_states":["CA"],"training_in_states":["CA"]}'
curl -X POST "http://localhost:9200/index/document/2" -d '{"id":2,"states_ties":["CA","NY"],"state_abbreviation":"FL","worked_in_states":["NY","CA"],"training_in_states":["NY","CA"]}'
curl -X POST "http://localhost:9200/index/document/3" -d '{"id":3,"states_ties":["CA","NY","FL"],"state_abbreviation":"NY","worked_in_states":["NY","CA"],"training_in_states":["NY","FL"]}'

curl -X GET 'http://localhost:9200/index/_search?per_page=10&pretty' -d '{
  "query": {
    "custom_filters_score": {
      "query": {
        "terms": {
          "states_ties": [
            "CA"
          ]
        }
      },
      "filters": [
        {
          "filter": {
            "term": {
              "state_abbreviation": "CA"
            }
          },
          "boost": 1.03
        },
        {
          "filter": {
            "terms": {
              "worked_in_states": [
                "CA"
              ]
            }
          },
          "boost": 1.02
        },
        {
          "filter": {
            "terms": {
              "training_in_states": [
                "CA"
              ]
            }
          },
          "boost": 1.01
        }
      ],
      "score_mode": "multiply"
    }
  },
  "sort": [
    {
      "_score": "desc"
    }
  ]
}'

results: id: score

1: 0.75584483
2: 0.73383
3: 0.7265643

Upvotes: 1

femtoRgon
femtoRgon

Reputation: 33341

This would be accomplished by the standard Lucene scoring implementation. If you were simply searching for "NY", without specifying an order, it will sort by relevance, and will assign highest relevance to a document with more occurances of the term, all else being equal.

Upvotes: 0

Related Questions