HelloPablo

Reputation: 625

Using Elasticsearch, how do I apply function scores to documents which conditionally have a property

I have a handful of indexes, some of which have a date property (date_publish) indicating when a document was published, while others do not. I am trying to apply a gauss function to decay the score of documents which were published a long time ago. The indexes that have the property are correctly mapped to recognise date_publish as a date.

I have set up my query as follows, with a filter intended to exclude documents which do not have the property from the gauss function:

{
  "index": "index_contains_prop,index_does_not_contains_prop",
  "body": {
    "query": {
      "function_score": {
        "score_mode": "avg",
        "query": {
          "match_all": {}
        },
        "functions": [
          {
            "script_score": {
              "script": {
                "source": "0"
              }
            }
          },
          {
            "filter": {
              "exists": {
                "field": "date_publish"
              }
            },
            "gauss": {
              "date_publish": {
                "origin": "now",
                "scale": "728d",
                "offset": "7d",
                "decay": 0.5
              }
            }
          }
        ]
      }
    },
    "from": 0,
    "size": 1000
  }
}

However, the query errors with the following:

{
  "error": {
    "root_cause": [
      {
        "type": "parsing_exception",
        "reason": "unknown field [date_publish]",
        "line": 1,
        "col": 0
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "index_does_not_contains_prop",
        "node": "1hfXZK4TT3-K288nIr0UWA",
        "reason": {
          "type": "parsing_exception",
          "reason": "unknown field [date_publish]",
          "line": 1,
          "col": 0
        }
      }
    ]
  },
  "status": 400
}

I have RTFM'd many times, and I can't see any discrepancy. I have also tried wrapping the exists condition in a bool:must object, to no avail.

Have I misunderstood the purpose of the filter argument?

Upvotes: 0

Views: 515

Answers (1)

Doron Yaacoby

Reputation: 9770

The error comes from the gauss function rather than the exists filter. An exists query on an unmapped field simply matches no documents, but a decay function is parsed on every shard the search hits, and it requires its field to be defined in the index mapping. The filter only decides which documents the function is applied to; it does not stop the function from being parsed against index_does_not_contains_prop, which does not have date_publish mapped. This is why you're getting an error. You can use the put mapping API to add this field to the indexes that don't have it (it won't change any document), and then your query should work.
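
A minimal sketch of that put mapping call, written in the same index/body parameter shape as the search request in the question. It assumes Elasticsearch 7 or later (typeless mappings) and that the client sends these parameters as PUT index_does_not_contains_prop/_mapping; older versions would also need a mapping type name, so treat the exact shape as an assumption to check against your version.

```json
{
  "index": "index_does_not_contains_prop",
  "body": {
    "properties": {
      "date_publish": {
        "type": "date"
      }
    }
  }
}
```

Documents in that index still have no value for date_publish, so the exists filter keeps the gauss function from being applied to them.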

Upvotes: 1
