Reputation: 397
I have an index with 20 different fields. I need to be able to pull unique docs where combination of fields "cat" and "sub" are unique. In SQL it would look this way: select unique cat, sub from table A; I can do it for one field this way:
{
"size": 0,
"aggs" : {
"unique_set" : {
"terms" : { "field" : "cat" }
}
}}
but how do I add another field to check uniqueness across two fields?
Thanks,
Upvotes: 6
Views: 6917
Reputation: 985
SQL's SELECT DISTINCT [cat], [sub]
can be imitated with a Composite Aggregation.
{
"size": 0,
"aggs": {
"cat_sub": {
"composite": {
"sources": [
{ "cat": { "terms": { "field": "cat" } } },
{ "sub": { "terms": { "field": "sub" } } }
]
}
}
}
}
Returns...
"buckets" : [
{
"key" : {
"cat" : "a",
"sub" : "x"
},
"doc_count" : 1
},
{
"key" : {
"cat" : "a",
"sub" : "y"
},
"doc_count" : 2
},
{
"key" : {
"cat" : "b",
"sub" : "y"
},
"doc_count" : 3
}
]
Upvotes: 4
Reputation: 137
Quote:
I need to be able to pull unique docs where combination of fields "cat" and "sub" are unique.
This is nonsense; your question is unclear. You can have 10s unique pairs {cat, sub}, and 100s unique triplets {cat, sub, field_3}, and 1000s unique documents Doc{cat, sub, field3, field4, ...}.
If you are interested in document counts per unique pair {"Category X", "Subcategory Y"} then you can use Cardinality aggregations. For two or more fields you will need to use scripting which will come with performance hit.
Example:
{
"aggs" : {
"multi_field_cardinality" : {
"cardinality" : {
"script": "doc['cats'].value + ' _my_custom_separator_ ' + doc['sub'].value"
}
}
}
}
Alternate solution: use nested Terms terms aggregations.
Upvotes: -3
Reputation: 383
The only way to solve this are probably nested aggregations:
{
"size": 0,
"aggs" : {
"unique_set_1" : {
"terms" : {
"field" : "cats"
},
"aggregations" : {
"unique_set_2": {
"terms": {"field": "sub"}
}
}
}
}
}
Upvotes: 1