bigpotato
bigpotato

Reputation: 27507

ElasticSearch: How to write query where string field is either null or empty?

I want to check for documents that have media_url == '' || media_url == null. I have a query:

{
    "engagements": [
        "blah"
    ],
    "query": {
        "from": 0,
        "size": 2,
        "sort": [
            {
                "bookmarked": {
                    "order": "desc"
                }
            },
            {
                "created_at": {
                    "order": "desc"
                }
            }
        ],
        "facets": {},
        "query": {
            "filtered": {
                "query": {
                    "match_all": {}
                },
                "filter": {
                    "bool": {
                        "must": [
                            {
                                "term": {
                                    "car_id": "78778"
                                }
                            },
                            {
                                "range": {
                                    "created_at": {
                                        "gte": "2015-04-12T04:00:00.000Z",
                                        "lte": "2015-05-13T03:59:59.999Z"
                                    }
                                }
                            },
                            {
                                "term": {
                                    "media_url": ""
                                }
                            }
                        ],
                        "should": [
                            {
                                "term": {
                                    "bookmarked": false
                                }
                            }
                        ]
                    }
                }
            }
        },
        "aggregations": {
            "word_frequencies": {
                "terms": {
                    "field": "text",
                    "size": 150
                }
            }
        },
        "highlight": {
            "fields": {
                "text": {
                    "fragment_size": 1500
                }
            }
        }
    },
    "api": "_search"
}

However, if I do what I do above, then records that are set to null wouldn't be returned. What should I do to return records with either '' or null as their media_url value?

Upvotes: 10

Views: 25447

Answers (2)

Prabin Meitei
Prabin Meitei

Reputation: 2000

You can use the missing filter to take care of null value or field itself is missing. You can combine the same with an empty string term to achieve what you want.

{ 
   "or": [
    {
      "term": {
        "media_url": ""
      }
    },
    {
      "missing": {
        "field": "media_url"
      }
    }   
    ]
}

Use the above instead of the single term query for "media_url" in the must clause of your Boolean filter.

Upvotes: 5

phuc77
phuc77

Reputation: 6915

Perhaps you can try using the "or" filter. http://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-or-filter.html

{
  "or": [
    {
      "term": {
        "media_url": ""
      }
    },
    {
      "term": {
        "media_url": null
      }
    }
  ]
}

Edit: Here's the full query (untested since I don't have an example document/index template)

{
    "engagements": [
        "blah"
    ],
    "query": {
      "from": 0,
      "size": 2,
      "sort": [
         {
            "bookmarked": {
               "order": "desc"
            }
         },
         {
            "created_at": {
               "order": "desc"
            }
         }
      ],
      "facets": {},
      "query": {
         "filtered": {
            "query": {
               "match_all": {}
            },
            "filter": {
               "bool": {
                  "must": [
                     {
                        "term": {
                           "car_id": "78778"
                        }
                     },
                     {
                        "range": {
                           "created_at": {
                              "gte": "2015-04-12T04:00:00.000Z",
                              "lte": "2015-05-13T03:59:59.999Z"
                           }
                        }
                     },
                     {
                        "or": [
                           {
                              "term": {
                                 "media_url": ""
                              }
                           },
                           {
                              "term": {
                                 "media_url": null
                              }
                           }
                        ]
                     }
                  ],
                  "should": [
                     {
                        "term": {
                           "bookmarked": false
                        }
                     }
                  ]
               }
            }
         }
      },
      "aggregations": {
         "word_frequencies": {
            "terms": {
               "field": "text",
               "size": 150
            }
         }
      },
      "highlight": {
         "fields": {
            "text": {
               "fragment_size": 1500
            }
         }
      }
   },
   "api": "_search"
}

Upvotes: 7

Related Questions