Reputation: 762
I have stored nested data of the following type in my index test_agg in ES:
{
"Date": "2015-10-21",
"Domain": "abc.com",
"Processed_at": "10/23/2015 9:47",
"Events": [
{
"Name": "visit",
"Count": "188",
"Value_Aggregations": [
{
"Value": "red",
"Count": "100"
}
]
},
{
"Name": "order_created",
"Count": "159",
"Value_Aggregations": [
{
"Value": "$125",
"Count": "50"
}
]
}
]
}
The mapping of the nested item is:
curl -XPOST localhost:9200/test_agg/nested_evt/_mapping -d '{
"nested_evt":{
"properties":{
"Events": {
"type": "nested"
}
}
}
}'
I am trying to get "Events.Count" and "Events.Value_Aggregations.Count" only for the event where Events.Name is "visit", using the query below:
{
"fields" : ["Events.Count","Events.Value_Aggregations.Count"],
"query": {
"filtered": {
"query": {
"match": { "Domain": "abc.com" }
},
"filter": {
"nested": {
"path": "Events",
"query": {
"match": { "Events.Name": "visit" }
}
}
}
}
}
}
Instead of returning a single value for each field:
Events.Count=[188] Events.Value_Aggregations.Count=[100]
it returns the values of every nested event:
Events.Count=[188,159] Events.Value_Aggregations.Count=[100,50]
What query structure would give me the desired output?
Upvotes: 1
Views: 8176
Reputation: 762
Here is the parent/child relationship query that produced my desired output:
{
"query": {
"filtered": {
"query": {
"bool": {"must": [
{"term": {"Name": "visit"}}
]}
},
"filter":{
"has_parent": {
"type": "domain_info",
"query" : {
"filtered": {
"query": { "match_all": {}},
"filter" : {
"and": [
{"term": {"Domain": "abc.com"}}
]
}
}
}
}
}
}
}
}
Upvotes: 1
Reputation: 8718
So the problem here is that the nested filter you are applying selects parent documents based on attributes of the nested child documents. So ES finds the parent document that matches your query (based on the document's nested children). Then, instead of returning the entire document, since you have specified "fields", it picks out only those fields that you have asked for. Those fields happen to be nested fields, and since the parent document has two nested children, it finds two values each for the fields you specified and returns them. To my knowledge there is no way to return the child documents instead, at least with a nested architecture.
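To illustrate the behaviour described above, here is a toy model in plain Python. It is not Elasticsearch code and makes no claim about how ES is implemented internally; the helper names collect, nested_filter_matches and select_fields are made up for this sketch. It only shows the two steps: the nested filter decides whether the parent matches, and the field selection then gathers values from all nested children of that parent.

```python
# Toy model only: plain Python, no Elasticsearch client involved.
# All function names here are hypothetical and exist just for illustration.
doc = {
    "Domain": "abc.com",
    "Events": [
        {"Name": "visit", "Count": "188",
         "Value_Aggregations": [{"Value": "red", "Count": "100"}]},
        {"Name": "order_created", "Count": "159",
         "Value_Aggregations": [{"Value": "$125", "Count": "50"}]},
    ],
}


def collect(node, path):
    """Return every value found at the dotted path, descending into all list items."""
    if not path:
        return [node]
    if isinstance(node, list):
        values = []
        for item in node:
            values.extend(collect(item, path))
        return values
    if isinstance(node, dict) and path[0] in node:
        return collect(node[path[0]], path[1:])
    return []


def nested_filter_matches(document, name):
    """Step 1: the parent matches if ANY nested event has the given name."""
    return any(event["Name"] == name for event in document["Events"])


def select_fields(document, name, fields):
    """Step 2: for a matching parent, gather the fields from ALL nested events."""
    if not nested_filter_matches(document, name):
        return None
    return {field: collect(document, field.split(".")) for field in fields}


result = select_fields(
    doc, "visit", ["Events.Count", "Events.Value_Aggregations.Count"]
)
print(result)
```

The "visit" condition is only used to decide whether the parent is a hit; it plays no part in choosing which children contribute values, which is why both 188 and 159 come back.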
One solution to this problem would be to use the parent/child relationship instead. Then you could use a has_parent query, in combination with the other filters, against the child type to get what you want. That would probably be a cleaner way to do this, as long as the schema architecture doesn't conflict with your other needs.
However, there is a way to do sort of what you are asking, with your current schema, with a nested aggregation combined with a filter aggregation. It's kind of involved (and slightly ambiguous in this case; see explanation below), but here's the query:
POST /test_index/_search
{
"size": 0,
"query": {
"filtered": {
"query": {
"match": {
"Domain": "abc.com"
}
},
"filter": {
"nested": {
"path": "Events",
"query": {
"match": {
"Events.Name": "visit"
}
}
}
}
}
},
"aggs": {
"nested_events": {
"nested": {
"path": "Events"
},
"aggs": {
"filtered_events": {
"filter": {
"term": {
"Events.Name": "visit"
}
},
"aggs": {
"events_count_terms": {
"terms": {
"field": "Events.Count"
}
},
"value_aggregations_count_terms": {
"terms": {
"field": "Events.Value_Aggregations.Count"
}
}
}
}
}
}
}
}
which returns:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0,
"hits": []
},
"aggregations": {
"nested_events": {
"doc_count": 2,
"filtered_events": {
"doc_count": 1,
"value_aggregations_count_terms": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "100",
"doc_count": 1
}
]
},
"events_count_terms": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "188",
"doc_count": 1
}
]
}
}
}
}
}
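For reading the values back out on the client side, a minimal sketch in Python follows. It assumes the response has already been parsed into a dict (trimmed here to the relevant keys) and that the aggregation names match the query above; bucket_keys is a hypothetical helper, not part of any client library.

```python
# Response from above, trimmed to the parts that matter for this sketch.
response = {
    "aggregations": {
        "nested_events": {
            "doc_count": 2,
            "filtered_events": {
                "doc_count": 1,
                "value_aggregations_count_terms": {
                    "buckets": [{"key": "100", "doc_count": 1}]
                },
                "events_count_terms": {
                    "buckets": [{"key": "188", "doc_count": 1}]
                },
            },
        }
    }
}


def bucket_keys(resp, terms_agg_name):
    """Return the bucket keys of one terms aggregation under the filter aggregation."""
    filtered = resp["aggregations"]["nested_events"]["filtered_events"]
    return [bucket["key"] for bucket in filtered[terms_agg_name]["buckets"]]


events_count = bucket_keys(response, "events_count_terms")
value_agg_count = bucket_keys(response, "value_aggregations_count_terms")
print(events_count, value_agg_count)
```

Keep in mind that terms buckets are computed over all parent documents that match the query, so with more than one matching parent each list can contain several keys.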
Caveat: it's not clear to me whether you actually need the "filter": { "nested": { ... } } clause of the "query" in what I've shown here. If this part filters out parent documents in a useful way, then you need it. If your only intention was to select the nested child documents from which to return fields, then it's redundant here, since the filter aggregation is taking care of that part.
Here is the code I used to test it:
http://sense.qbox.io/gist/dcc46e50117031de300b6f91c647fe9b729a5283
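To make the caveat about the nested filter concrete, here is a sketch in Python that builds the request body either way. The function build_search_body and its filter_parents flag are hypothetical names for this example; the body itself is the same as the query above, and when the flag is off the "filtered" wrapper is replaced by the plain "match" on Domain.

```python
import json


def build_search_body(domain, event_name, filter_parents=True):
    """Build the search body; filter_parents toggles the nested filter on the query."""
    domain_query = {"match": {"Domain": domain}}
    if filter_parents:
        # Only parents with at least one matching nested event are considered.
        query = {
            "filtered": {
                "query": domain_query,
                "filter": {
                    "nested": {
                        "path": "Events",
                        "query": {"match": {"Events.Name": event_name}},
                    }
                },
            }
        }
    else:
        # Every parent for the domain is considered; the filter
        # aggregation below still narrows down the nested events.
        query = domain_query
    return {
        "size": 0,
        "query": query,
        "aggs": {
            "nested_events": {
                "nested": {"path": "Events"},
                "aggs": {
                    "filtered_events": {
                        "filter": {"term": {"Events.Name": event_name}},
                        "aggs": {
                            "events_count_terms": {
                                "terms": {"field": "Events.Count"}
                            },
                            "value_aggregations_count_terms": {
                                "terms": {"field": "Events.Value_Aggregations.Count"}
                            },
                        },
                    }
                },
            }
        },
    }


with_filter = build_search_body("abc.com", "visit")
without_filter = build_search_body("abc.com", "visit", filter_parents=False)
print(json.dumps(without_filter, indent=2))
```

In both variants the aggregation part is identical; the only difference is which parent documents the aggregations get to see.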
Upvotes: 7