OpenSearch: how to use a calculated field in a search query?

I want to search by a calculated field.

For example, I calculate a field with a script and want to use it in the query afterwards, as in the example below.

POST /process-instance/_search
{
  "script_fields": {
    "duration": {
      "script": {
        "source": 
          """
            doc['completedAt'].value.toInstant().toEpochMilli() - doc['createdAt'].value.toInstant().toEpochMilli()
          """,
        "lang": "painless"
      }
    }
  },
  "query": {
    "range": {
      //using calculated field 'duration' in search query
      "duration": {
        "gte": 1000
      }
    }
  }
}

The 'duration' field is calculated and returned in the result for every document:

POST /process-instance/_search
{
  "script_fields": {
    "duration": {
      "script": {
        "source": 
          """
            doc['completedAt'].value.toInstant().toEpochMilli() - doc['createdAt'].value.toInstant().toEpochMilli()
          """,
        "lang": "painless"
      }
    }
  }
}

--- Result:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "process-instance",
        "_id": "cf30102e-e945-4053-8e03-2d7fc4a4517a",
        "_score": 1,
        "fields": {
          "duration": [
            60227000
          ]
        }
      }
    ]
  }
}

But when I use it in the search query (first snippet), the result is empty.

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 0,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  }
}

Do you have any idea how to use a calculated field in a search query?

Upvotes: 0

Views: 756

Answers (2)

rishabh

Reputation: 1

You can use derived fields, which are available in the latest release: https://opensearch.org/docs/latest/field-types/supported-field-types/derived/

POST /process-instance/_search
{
  "derived": {
    "duration": {
      "type": "long",
      "script": {
        "source": 
          """
            emit(doc['completedAt'].value.toInstant().toEpochMilli() - doc['createdAt'].value.toInstant().toEpochMilli())
          """
      }
    }
  },
  "query": {
    "range": {
      "duration": {
        "gte": 1000
      }
    }
  },
  "fields": ["duration"]
}

Make sure you enable search.allow_expensive_queries before using this feature.
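
For reference, a minimal sketch of turning that setting on through the cluster settings API. This assumes the setting was disabled on your cluster and that you have permission to change cluster settings; the derived fields documentation linked above may list further settings that apply to your version.

PUT _cluster/settings
{
  "persistent": {
    "search.allow_expensive_queries": true
  }
}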

Upvotes: 0

Musab Dogan

Reputation: 3680

It is not possible to filter by fields defined with script_fields, because script_fields are calculated during the final stage of the search (the fetch phase), after the query has already selected the matching documents. There is a similar discussion here.

However, you can put a comparison such as "greater than X" inside the script itself, so that the script field returns true or false:

... doc['createdAt'].value.toInstant().toEpochMilli() < 1000
  • Note: See the bottom of this answer for the recommended approach (ingest pipeline).

  • Note 2: Elasticsearch users can use runtime fields.

Here is an example of adding such a comparison:

PUT process-instance/_doc/1
{
  "createdAt": "2024-06-05T10:59:19.000Z",
  "completedAt": "2024-06-05T11:59:19.000Z"
}

POST /process-instance/_search
{
  "script_fields": {
    "duration": {
      "script": {
        "source": 
          """
            //calculated field 'duration' in minutes => the result is 60.
            //comparison added: less than or equal to 70. If the condition is satisfied the script returns *true*, otherwise it returns *false*
            (doc['completedAt'].value.toInstant().toEpochMilli() - doc['createdAt'].value.toInstant().toEpochMilli())/1000/60 <= 70
          """,
        "lang": "painless"
      }
    }
  }
}
Response: "duration": true
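
If the goal is to exclude non-matching documents rather than just flag them, the same expression can be moved out of script_fields and into a script query, which runs during the query phase. The following is a minimal sketch, not a tuned solution: the parameter name min_duration_ms is made up for illustration, and the size() checks assume some documents may lack one of the two date fields. Script queries are evaluated per document and count as expensive queries, so they can be slow on large indexes.

POST /process-instance/_search
{
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": {
            "lang": "painless",
            "source":
              """
                // skip documents that are missing either date
                if (doc['completedAt'].size() == 0 || doc['createdAt'].size() == 0) {
                  return false;
                }
                // keep only documents whose duration (ms) reaches the threshold
                return doc['completedAt'].value.toInstant().toEpochMilli() - doc['createdAt'].value.toInstant().toEpochMilli() >= params.min_duration_ms;
              """,
            "params": {
              "min_duration_ms": 1000
            }
          }
        }
      }
    }
  }
}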

Recommended approach:

Use an ingest pipeline: calculate the duration in advance and save it as a separate field. That way, search latency will be better and querying will be simpler. For more information, you can read this blog post: https://www.elastic.co/blog/calculating-ingest-lag-and-storing-ingest-time-in-elasticsearch-to-improve-observability
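
A rough sketch of what such a pipeline could look like. The pipeline name compute-duration is a placeholder, and the script assumes both dates arrive as ISO-8601 UTC strings (as in the sample document above); other date formats would need different parsing.

PUT _ingest/pipeline/compute-duration
{
  "description": "Store completedAt - createdAt in milliseconds as 'duration'",
  "processors": [
    {
      "script": {
        "lang": "painless",
        "source":
          """
            // only compute when both timestamps are present
            if (ctx.createdAt != null && ctx.completedAt != null) {
              ctx.duration = Instant.parse(ctx.completedAt).toEpochMilli() - Instant.parse(ctx.createdAt).toEpochMilli();
            }
          """
      }
    }
  ]
}

PUT process-instance/_doc/1?pipeline=compute-duration
{
  "createdAt": "2024-06-05T10:59:19.000Z",
  "completedAt": "2024-06-05T11:59:19.000Z"
}

POST /process-instance/_search
{
  "query": {
    "range": {
      "duration": {
        "gte": 1000
      }
    }
  }
}

Because duration is now a regular indexed field, the plain range query from the question works as written. Documents indexed before the pipeline existed would need to be reindexed through it to get the field.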

Upvotes: 0
