HelloPablo

Reputation: 625

Elasticsearch: Conditionally filter query on fields if they exist in multi-index query

I have a query for a general search which spans multiple indices. Some of the indices have a field called is_published, some have a field called date_review, and some have both.

I'm struggling to write a query which will search across fields and filter on the fields mentioned above, but only if they exist. I have managed to achieve what I want for each field individually using missing and/or exists, but doing so excludes the other variants.

In English, I want to keep documents in the result where:

  1. is_published is true OR the field does not exist
  2. date_review is in the future OR the field does not exist

So, if a document has is_published and it's false, remove it. If a document has date_review in the past, remove it. If it has is_published == false and date_review is in the future, remove it.

I hope this makes sense?

For the purpose of answering, assume the documents might look like this:

//  Has `is_published` flag
{
    "label": "My document",
    "body": "Lorem ipsum doler et sum.",
    "is_published": true
}

//  Has `date_review` flag
{
    "label": "My document",
    "body": "Lorem ipsum doler et sum.",
    "date_review": "2017-01-01"
}


//  Has both `is_published` and `date_review` flags
{
    "label": "My document",
    "body": "Lorem ipsum doler et sum.",
    "is_published": true,
    "date_review": "2017-01-01"
}

At the moment, my [unfiltered] query looks like this:

{
  "index": "index-1,index-2,index-3",
  "type": "item",
  "body": {
    "query": {
      "filtered": {
        "query": {
          "multi_match": {
            "query": "my search phrase",
            "type": "phrase_prefix",
            "fuzziness": null,
            "fields": [
              "label^3",
              "body"
            ]
          }
        },
        "filter": []
      }
    }
  }
}

Very grateful for any pointers.

Thanks.

Upvotes: 1

Views: 1853

Answers (1)

Val

Reputation: 217314

You can try a query like this one:

{
  "query": {
    "filtered": {
      "query": {
        "multi_match": {
          "query": "my search phrase",
          "type": "phrase_prefix",
          "fuzziness": null,
          "fields": [
            "label^3",
            "body"
          ]
        }
      },
      "filter": {
        "bool": {
          "must": [
            {
              "bool": {
                "minimum_should_match": 1,
                "should": [
                  {
                    "missing": {
                      "field": "is_published"
                    }
                  },
                  {
                    "term": {
                      "is_published": true
                    }
                  }
                ]
              }
            },
            {
              "bool": {
                "minimum_should_match": 1,
                "should": [
                  {
                    "missing": {
                      "field": "date_review"
                    }
                  },
                  {
                    "range": {
                      "date_review": {
                        "gt": "now"
                      }
                    }
                  }
                ]
              }
            }
          ]
        }
      }
    }
  }
}
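
Note that the filtered query and the missing filter were both removed in Elasticsearch 5.0. If you are on 5.x or later, the same logic can be sketched with a bool query, where "field does not exist" becomes a must_not around exists. This is an untested sketch that assumes the same field names as in the question. I have left out "fuzziness": null, on the assumption that it is not needed with phrase_prefix:

```json
{
  "query": {
    "bool": {
      "must": {
        "multi_match": {
          "query": "my search phrase",
          "type": "phrase_prefix",
          "fields": [
            "label^3",
            "body"
          ]
        }
      },
      "filter": [
        {
          "bool": {
            "minimum_should_match": 1,
            "should": [
              {
                "bool": {
                  "must_not": {
                    "exists": {
                      "field": "is_published"
                    }
                  }
                }
              },
              {
                "term": {
                  "is_published": true
                }
              }
            ]
          }
        },
        {
          "bool": {
            "minimum_should_match": 1,
            "should": [
              {
                "bool": {
                  "must_not": {
                    "exists": {
                      "field": "date_review"
                    }
                  }
                }
              },
              {
                "range": {
                  "date_review": {
                    "gt": "now"
                  }
                }
              }
            ]
          }
        }
      ]
    }
  }
}
```

Each entry in the filter array must match, and within each entry at least one should clause must match, so a document is kept only if both "true or missing" and "in the future or missing" hold. Because these clauses sit in filter context, they do not affect scoring.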

Upvotes: 1
