Joost VanDorp

Reputation: 348

How do I create an "OR" filter using elasticsearch-dsl-py?

The query below is what I would like to construct using elasticsearch-dsl-py, but I do not know how to do it.

GET /my_index/_search
{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "status": "published"
              }
            },
            {
              "or": {
                "filters": [
                  {
                    "range": {
                      "start_publication": {
                        "lte": "2015-02-17T03:45:00.245012+00:00"
                      }
                    }
                  },
                  {
                    "missing": {
                      "field": "start_publication"
                    }
                  }
                ]
              }
            },
            {
              "or":{
                "filters": [
                  {
                    "range": {
                      "end_publication": {
                        "gte": "2015-02-17T03:45:00.245012+00:00"
                      }
                    }
                  },
                  {
                    "missing": {
                      "field": "end_publication"
                    }
                  }
                ]
              }
            }
          ]
        }
      }
    }
  }
}

Using elasticsearch-dsl-py, this is as close as I can get, but it is not the same. The '|' operator turns into 'should' clauses instead of an 'or' filter.

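    from elasticsearch import Elasticsearch
    from elasticsearch_dsl import F, Search  # F: filter helper in pre-2.0 elasticsearch-dsl

    client = Elasticsearch()
    now = timezone.now()  # timezone is presumably django.utils.timezone

    # PUBLISHED is a constant defined elsewhere in my code
    s = Search(using=client,
               index="my_index"
        ).filter(
            "term", status=PUBLISHED
        ).filter(
            F("range", start_publication={"lte": now}) |
            F("missing", field="start_publication")
        ).filter(
            F("range", end_publication={"gte": now}) |
            F("missing", field="end_publication")
        )
    response = s.execute()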

Upvotes: 9

Views: 15181

Answers (2)

Michael

Reputation: 711

With Elasticsearch 2.x (and elasticsearch-dsl 2.x or later) you can no longer apply filters as in @theslow1's comment. Instead, you have to construct your filter by combining Q objects:

from elasticsearch_dsl import Q, Search

# esclient is an existing Elasticsearch client instance
search = Search(using=esclient, index="myIndex")
firstFilter = Q("match", color='blue') & Q("match", status='published')
secondFilter = Q("match", color='yellow') & Q("match", author='John Doe')
combinedFilter = firstFilter | secondFilter
search = search.query('bool', filter=[combinedFilter])

The call search.query('bool', filter=[combinedFilter]) applies the Q criteria as a filter, as described in the elasticsearch-dsl documentation.
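
For the query in the question, the same idea might look roughly like the sketch below. It builds the request body as plain dicts instead of going through elasticsearch-dsl, so it can be run and inspected without a cluster. The helper names `open_or_missing` and `build_published_query` are made up for illustration. It assumes Elasticsearch 5.0 or later, where `filtered`, `or` and `missing` are no longer available, so `bool`/`should` stands in for `or`, and `must_not` + `exists` stands in for `missing`.

```python
import json
from datetime import datetime, timezone


def open_or_missing(field, op, when):
    """Match documents where `field` satisfies the range OR has no value."""
    return {
        "bool": {
            "should": [
                {"range": {field: {op: when}}},
                # Replacement for the old "missing" filter: must_not + exists
                {"bool": {"must_not": [{"exists": {"field": field}}]}},
            ],
            # At least one of the two alternatives has to match
            "minimum_should_match": 1,
        }
    }


def build_published_query(now):
    """Build the search body: published AND started-or-open AND not-ended-or-open."""
    when = now.isoformat()
    return {
        "query": {
            "bool": {
                # Clauses under "filter" are ANDed and do not affect scoring
                "filter": [
                    {"term": {"status": "published"}},
                    open_or_missing("start_publication", "lte", when),
                    open_or_missing("end_publication", "gte", when),
                ]
            }
        }
    }


# Same timestamp as in the question, for easy comparison
now = datetime(2015, 2, 17, 3, 45, 0, 245012, tzinfo=timezone.utc)
body = build_published_query(now)
print(json.dumps(body, indent=2))
```

With Q objects, each inner clause should correspond to something like `Q("range", start_publication={"lte": now}) | ~Q("exists", field="start_publication")`, passed in the `filter` list as shown above.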

Upvotes: 7

Joost VanDorp

Reputation: 348

Solution:

s = Search(using=client,
           index="my_index"
    ).filter(
        "term", status=PUBLISHED
    ).filter(
        "or", [F("range", start_publication={"lte": now}, ),
               F("missing", field="start_publication")]
    ).filter(
        "or", [F("range", end_publication={"gte": now}, ),
               F("missing", field="end_publication")]
    )

Which turns into:

{  
   "query":{  
      "filtered":{  
         "filter":{  
            "bool":{  
               "must":[  
                  {  
                     "term":{  
                        "status":"published"
                     }
                  },
                  {  
                     "or":{  
                        "filters":[  
                           {  
                              "range":{  
                                 "start_publication":{  
                                    "lte":"2015-02-17T03:45:00.245012+00:00"
                                 }
                              }
                           },
                           {  
                              "missing":{  
                                 "field":"start_publication"
                              }
                           }
                        ]
                     }
                  },
                  {  
                     "or":{  
                        "filters":[  
                           {  
                              "range":{  
                                 "end_publication":{  
                                    "gte":"2015-02-17T03:45:00.245012+00:00"
                                 }
                              }
                           },
                           {  
                              "missing":{  
                                 "field":"end_publication"
                              }
                           }
                        ]
                     }
                  }
               ]
            }
         },
         "query":{  
            "match_all":{  

            }
         }
      }
   }
}

Hopefully this can be included in the elasticsearch-dsl-py documentation in the future.

Upvotes: 7
