Reputation: 31
How can I search two Elasticsearch indices and aggregate only the values that occur in both of them?
For instance:
GET indexA,indexB/_search
{
  "aggs": {
    "myField": {
      "terms": {
        "field": "myField"
      }
    }
  }
}
This returns every value that myField takes in either index (indexA or indexB). How can I change it so that it returns only the values that appear in both indexA and indexB?
To clarify: if myField has the values value1, value2 and value3 in indexA but only value1 and value2 in indexB, the search should return only value1 and value2.
Upvotes: 1
Views: 177
Reputation: 52368
You can do it like this, sending the body below to the same GET indexA,indexB/_search endpoint (this requires Elasticsearch 2.x):
{
  "size": 0,
  "aggs": {
    "myField": {
      "terms": {
        "field": "myField"
      },
      "aggs": {
        "count_indices": {
          "cardinality": {
            "field": "_index"
          }
        },
        "values_bucket_filter_by_index_count": {
          "bucket_selector": {
            "buckets_path": {
              "count": "count_indices"
            },
            "script": "count >= 2"
          }
        }
      }
    }
  }
}
With "terms": {"field": "myField"} you get the unique myField values. Then, as a sub-aggregation, "cardinality": {"field": "_index"} counts the number of indices that contain each value. The final aggregation, values_bucket_filter_by_index_count, keeps only those buckets whose value appears in at least two indices.
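To make the three steps concrete, here is a minimal plain-Python sketch of the same logic run over an in-memory list of documents. It does not call Elasticsearch; the function name values_in_all_indices and the sample docs are made up for illustration, and the buckets are sorted by key for readability, whereas a real terms aggregation orders by doc_count by default.

```python
from collections import defaultdict


def values_in_all_indices(docs, field, min_indices=2):
    """Mimic terms -> cardinality(_index) -> bucket_selector in plain Python.

    `docs` is a list of dicts shaped like search hits: {"_index": ..., "_source": {...}}.
    This is an illustration of the idea, not how Elasticsearch computes it.
    """
    indices_per_value = defaultdict(set)  # plays the role of count_indices
    doc_count = defaultdict(int)          # plays the role of doc_count

    for doc in docs:
        value = doc["_source"].get(field)
        if value is None:
            continue
        indices_per_value[value].add(doc["_index"])
        doc_count[value] += 1

    # bucket_selector step: keep buckets seen in at least `min_indices` indices
    return [
        {"key": value, "doc_count": doc_count[value], "count_indices": len(indices)}
        for value, indices in sorted(indices_per_value.items())
        if len(indices) >= min_indices
    ]


# Hypothetical sample data matching the example in the question
docs = [
    {"_index": "indexA", "_source": {"myField": "value1"}},
    {"_index": "indexA", "_source": {"myField": "value2"}},
    {"_index": "indexA", "_source": {"myField": "value3"}},
    {"_index": "indexB", "_source": {"myField": "value1"}},
    {"_index": "indexB", "_source": {"myField": "value2"}},
]

buckets = values_in_all_indices(docs, "myField")
for bucket in buckets:
    print(bucket)
```

One difference worth keeping in mind: the set-based count here is exact, while the cardinality aggregation is an approximate count, although it should be accurate for a handful of distinct index names.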
In the end, the aggregation result looks like this:
"aggregations": {
  "myField": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "value1",
        "doc_count": 2,
        "count_indices": {
          "value": 2
        }
      },
      {
        "key": "value2",
        "doc_count": 2,
        "count_indices": {
          "value": 2
        }
      }
    ]
  }
}
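As I mentioned, the bucket_selector aggregation requires Elasticsearch 2.x or later. On later versions that use Painless as the scripting language, the script refers to the variable as params.count instead of count.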
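If the same check is needed for more than two indices, the request body can be generated rather than written by hand. The following is only a sketch: build_common_values_query and its parameters are hypothetical names, and it just builds the JSON body without sending it to a cluster.

```python
import json


def build_common_values_query(field, index_count, size=10):
    """Build the request body that keeps only values present in `index_count` indices.

    Hypothetical helper: it assembles the same terms / cardinality /
    bucket_selector structure as the query above.
    """
    return {
        "size": 0,
        "aggs": {
            field: {
                "terms": {"field": field, "size": size},
                "aggs": {
                    "count_indices": {"cardinality": {"field": "_index"}},
                    "values_bucket_filter_by_index_count": {
                        "bucket_selector": {
                            "buckets_path": {"count": "count_indices"},
                            # 2.x style script; with Painless use "params.count >= N"
                            "script": "count >= %d" % index_count,
                        }
                    },
                },
            }
        },
    }


body = build_common_values_query("myField", index_count=2)
print(json.dumps(body, indent=2))
```

Note that bucket_selector only filters the buckets the terms aggregation actually returns, so the size value probably needs to be large enough to cover all the values of interest.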
Upvotes: 1