xplat
xplat

Reputation: 8624

How to ignore the nested objects that have null value or don't exist

I have the below aggregations query.

{
    "aggs": {
        "selected_brand": {
            "filter": {
                "term": {
                    "brandId": "b1d28821-3730-4266-8f55-eb69596004fb"
                }
            }
        },
        "sponsorshipSets": {
            "nested": {
                "path": "sponsorshipSets"
            },
            "aggs": {
                "sponsorships": {
                    "nested": {
                        "path": "sponsorshipSets.sponsorships"
                    },
                    "aggs": {
                        "count": {
                            "value_count": {
                                "field": "sponsorshipSets.sponsorships.id"
                            }
                        }
                    }
                }
            }
        }
    }
}

The response is the below.

{
  "hits": {
    "total": {
      "value": 2980,
      "relation": "eq"
    }
  },
  "aggregations": {
    "selected_brand": {
      "doc_count": 314
    },
    "sponsorshipSets": {
      "doc_count": 2635,
      "sponsorships": {
        "doc_count": 1076,
        "count": {
          "value": 1076
        }
      }
    }
  }
}

The response shows the count of sponsorship documents is 1076, now I want to retrieve the documents these documents and tried with the below query.

{
    "query": {
        "bool": {
            "must": {
                "nested": {
                    "path": "sponsorshipSets",
                    "query": {
                        "nested": {
                            "path": "sponsorshipSets.sponsorships",
                            "query": {
                                "bool": {
                                    "must_not": [
                                        {
                                            "match": {
                                                "sponsorshipSets.sponsorships": "null"
                                            }
                                        }
                                    ]
                                }
                            }
                        }
                    }
                }
            },
            "filter": [
                {
                    "term": {
                        "brandId": "b1d28821-3730-4266-8f55-eb69596004fb"
                    }
                }
            ]
        }
    }
}

The interesting thing for the second query is the hits below is only 82.

"hits": {
    "total": {
      "value": 82,
      "relation": "eq"
    },

What I really want is to retrieve the count of all the sponsorshipSets.sponsorships documents that are not null or not exist. SponsorshipSets can be missing as well.

Find below the abbreviated template.

{
  "index_patterns": "campaigns*",
  "order": 4,
  "version": 4,
  "aliases": {
    "campaigns": {

    }
  },
  "settings": {
    "number_of_shards": 5
  },
  "mappings": {
    "dynamic": "false",
    "properties": {
      "brandId": {
        "type": "keyword"
      },
      "sponsorshipSets": {
        "type": "nested",
        "properties": {
          "id": {
            "type": "keyword"
          },
          "sponsorships": {
            "type": "nested",
            "properties": {
              "id": {
                "type": "keyword"
              }
            }
          }
        }
      }
    }

Upvotes: 0

Views: 731

Answers (1)

Kamal Kunjapur
Kamal Kunjapur

Reputation: 8840

You can use exists query to filter such documents. Below query should help.

Query Request:

POST <your_index_name>/_search
{
   "query":{
      "bool":{
         "must":{
            "nested":{
               "path":"sponsorshipSets",
               "query":{
                  "nested":{
                     "path":"sponsorshipSets.sponsorships",
                     "query":{
                        "bool":{
                           "must_not":[
                              {
                                 "exists":{
                                    "field":"sponsorshipSets.sponsorships"
                                 }
                              }
                           ]
                        }
                     }
                  }
               }
            }
         },
         "filter":[
            {
               "term":{
                  "brandId":"b1d28821-3730-4266-8f55-eb69596004fb"
               }
            }
         ]
      }
   }
}

This should return all three scenarios where your document JSON structure would be

  • sponsorshipSets.sponsorships: {} i.e. you have empty structure for sponsorships

  • sponsorshipSets.sponsorships: null i.e. the value is set as null

  • Or your document doesn't have sponsorships field in first place.

You don't need to use any aggregations for this as ES would return you the count of such documents in hits.total.value of the response.

Let me know if this helps!

Upvotes: 2

Related Questions