Reputation: 537
I have the database table columns like:
ID | Biz Name | License # | Violations | ...
I need to find out those businesses that have more than 5 violations.
I have the following:
{
"query": {
"bool": {
"must": {
"match": {
"violations": {
"query": "MICE DROPPINGS were OBSERVED",
"operator": "and"
}
}
},
"must_not": {
"match": {
"violations": {
"query": "NO MICE DROPPINGS were OBSERVED",
"operator": "and"
}
}
}
}
}
},
"aggs" : {
"selected_bizs" :{
"terms" : {
"field" : "Biz Name.keyword",
"min_doc_count": 5,
"size" :1000
},
"aggs": {
"top_biz_hits": {
"top_hits": {
"size": 10
}
}
}
}
}
}
It seems working.
Now I need to find out those businesses that have 5 or more violations(like above), and also have 3 or more license #s.
I am not sure how to further aggregate this.
Thanks!
Upvotes: 2
Views: 1468
Reputation: 16943
Let's assume that your License #
field is defined just like the Biz Name
and has a .keyword
mapping.
Now, the statement:
find the businesses that have ... 3 or more license #s
can be rephrased as:
aggregate by the
business name
under the condition that the number of distinct values of the bucketedlicense IDs
is greater or equal to 3.
With that being said, you can use the cardinality
aggregation to get distinct License IDs.
Secondly, the mechanism for "aggregating under a condition" is the handy bucket_script
aggregation which executes a script to determine whether the currently iterated bucket will be retained in the final aggregation.
Leveraging both of these in tandem would mean:
POST your-index/_search
{
"size": 0,
"query": {
"bool": {
"must": {
"match": {
"violations": {
"query": "MICE DROPPINGS were OBSERVED",
"operator": "and"
}
}
},
"must_not": {
"match": {
"violations": {
"query": "NO MICE DROPPINGS were OBSERVED",
"operator": "and"
}
}
}
}
},
"aggs": {
"selected_bizs": {
"terms": {
"field": "Biz Name.keyword",
"min_doc_count": 5,
"size": 1000
},
"aggs": {
"top_biz_hits": {
"top_hits": {
"size": 10
}
},
"unique_license_ids": {
"cardinality": {
"field": "License #.keyword"
}
},
"must_have_min_3_License #s": {
"bucket_selector": {
"buckets_path": {
"unique_license_ids": "unique_license_ids"
},
"script": "params.unique_license_ids >= 3"
}
}
}
}
}
}
and that's all there's to it!
Upvotes: 2