Reputation: 6315
Given the following index definition and query:
curl -XDELETE "localhost:9200/products"
curl -XPUT "localhost:9200/products"
curl -XPUT "localhost:9200/products/_mapping" -H 'Content-Type: application/json' -d'
{
"properties": {
"opinions": {
"type": "nested",
"properties": {
"topic": {"type": "keyword"},
"count": {"type": "long"}
},
"include_in_parent": true
}
}
}'
curl -X POST "localhost:9200/products/_bulk" -H 'Content-Type: application/json' -d'
{"index":{"_id":1}}
{"opinions":[{"topic": "room", "count": 2}, {"topic": "kitchen", "count": 1}]}
{"index":{"_id":2}}
{"opinions":[{"topic": "room", "count": 1}, {"topic": "restroom", "count": 1}]}
'
sleep 2
curl -X POST "localhost:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
"size": 0,
"aggs": {
"per_topic": {
"terms": {"field": "opinions.topic"},
"aggs": {
"counts": {
"sum": {"field": "opinions.count"}
}
}
}
}
}
'
Produces the result:
"aggregations" : {
"per_topic" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "room",
"doc_count" : 2,
"counts" : {
"value" : 5.0
}
},
{
"key" : "kitchen",
"doc_count" : 1,
"counts" : {
"value" : 3.0
}
},
{
"key" : "restroom",
"doc_count" : 1,
"counts" : {
"value" : 2.0
}
}
]
}
}
}
I'm expecting the sum of room to be 3, kitchen to be 1 and restroom to be 1, counting only the related nested documents, but instead it is summing all the nested count fields in all the matched the documents.
How can I sum only the matched aggregated nested documents?
UPDATE: solution based on comments
curl -X POST "localhost:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
"size": 0,
"aggs": {
"opinions": {
"nested": {"path": "opinions"},
"aggs": {
"per_topic": {
"terms": {"field": "opinions.topic"},
"aggs": {
"counts": {
"sum": {"field": "opinions.count"}
}
}
}
}
}
}
}
'
Upvotes: 2
Views: 486
Reputation: 1072
The main initial problem was the use of object fields instead of nested fields: only using nested fields is it possible to preserve the structure [{"room", 2}, {"kitchen", 1}]
, as in object fields the data is flattened to {["room", "kitchen"], [1,2]}
without relationships between "room"
and 2
.
Unluckily, at the moment is not possible to use the SQL API to group by (some?) nested fields, but it is possible to write a native Elastic query using nested aggregations.
Upvotes: 2