Eugene
Eugene

Reputation: 3957

ElasticSearch aggregate nested fields as part of parent document

I have index with Product entities and nested Variation entities. Product entity consist of Id, Title and nested variations. Variation entity consist of Color, Size and Price fields. I need to aggregate search result by Color, Size and Price fields to get number of products for each color, size and price groups. If I use nested aggregation for these fields I get correct buckes but the number of documents in buckets is number of Variation entities per bucket. But I need to get number of Product entities (root documents) per bucket.

For example, the first product has variation (red, small, $10), (green, small, $10), (red, medium, $11) and the second product has variations (green, medium, $15). Nested aggregation returns 2 for red and 2 for small because 2 variations has red color and small size. But I need number of products (root entities) per bucket which should be 1 for red and 1 for small.

I also can't use children documents instead of nested documents because of other requirements.

How to compose the query to get this result?

Here is the mapping:

{
  "product": {
    "properties": {
      "id": {
        "type": "long"
      },
      "title": {
        "type": "string"
      },
      "brand": {
        "type": "string"
      },
      "variations": {
        "type": "nested",
        "properties": {
          "id": {
            "type": "long"
          },
          "colour": {
            "type": "string"
          },
          "size": {
            "type": "string"
          },
          "price": {
            "type": "double"
          }
        }
      },
      "location": {
        "type": "geo_point"
      }
    }
  }
}

And here is a query

{
  "aggs": {
    "Variations": {
      "nested": {
        "path": "variations"
      },
      "aggs": {
        "Colous": {
          "terms": {
            "field": "variations.colour"
          }
        },
        "Sizes": {
          "terms": {
            "field": "variations.size"
          }
        }
      }
    },
    "Brands": {
      "terms": {
        "field": "brand"
      }
    }
  },
  "query": {
    "match_all": {}
  }
}

The Brand aggregation works well because it gets number of root documents per group but nested aggregations return number of nested documents instead of number of root documents.

Upvotes: 6

Views: 4080

Answers (1)

Val
Val

Reputation: 217274

You've tackled the problem the right way. Now you can simply use the reverse_nested aggregation in order to "join back" to the root product and get the count of matching products for each for your variations.

{
  "aggs": {
    "Variations": {
      "nested": {
        "path": "variations"
      },
      "aggs": {
        "Colous": {
          "terms": {
            "field": "variations.colour"
          },
          "aggs": {
            "product_count": {            <--- add this reverse nested agg
              "reverse_nested": {}
            }
          }
        },
        "Sizes": {
          "terms": {
            "field": "variations.size"
          },
          "aggs": {
            "product_count": {            <--- add this reverse nested agg
              "reverse_nested": {}
            }
          }
        }
      }
    },
    "Brands": {
      "terms": {
        "field": "brand"
      }
    }
  },
  "query": {
    "match_all": {}
  }
}

In the response, you'll see that:

  • 2 products are matching colour: green
  • 1 product is matching colour: red
  • 2 products are matching size: medium
  • 1 product is matching size: small

Upvotes: 8

Related Questions