Reputation: 483
I would like a query that returns the number of times a field value is repeated, grouped by the unique value of another field. I have this JSON:
{
"name" : james,
"city" : "chicago" <----------- same
},
{
"name" : james,
"city" : "san francisco"
},
{
"name" : james,
"city" : "chicago" <-----------same
},
{
"name" : Mike,
"city" : "chicago"
},
{
"name" : Mike,
"city" : "texas"<-----------same
},
{
"name" : Mike,
"city" : "texas"<-----------same
},
{
"name" : Peter,
"city" : "chicago"
},
I want a query that counts the unique combinations of two fields. For example, james should be 2, because there are two documents with the same pair (name: james, city: chicago) and one document with a different pair (name: james, city: san francisco). The output would then be the following:
{
"key" : "james",
"doc_count" : 2
},
{
"key" : "Mike",
"doc_count" : 2
},
{
"key" : "Peter",
"doc_count" : 1
},
Is it possible to count the unique values of two fields combined?
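To pin down the counting rule, here is a minimal sketch in plain Python, independent of Elasticsearch. The `docs` list mirrors the sample documents above, and the function name `distinct_cities_per_name` is made up for illustration:

```python
from collections import defaultdict

# Sample documents copied from the question.
docs = [
    {"name": "james", "city": "chicago"},
    {"name": "james", "city": "san francisco"},
    {"name": "james", "city": "chicago"},
    {"name": "Mike", "city": "chicago"},
    {"name": "Mike", "city": "texas"},
    {"name": "Mike", "city": "texas"},
    {"name": "Peter", "city": "chicago"},
]


def distinct_cities_per_name(documents):
    """Count how many different cities appear for each name."""
    cities = defaultdict(set)
    for doc in documents:
        cities[doc["name"]].add(doc["city"])
    # Shape the result like the buckets in the desired output.
    return [{"key": name, "doc_count": len(seen)} for name, seen in cities.items()]


buckets = distinct_cities_per_name(docs)
for bucket in buckets:
    print(bucket)
```

Duplicate (name, city) pairs collapse into one entry per name, which is why james and Mike both come out as 2.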
Upvotes: 2
Views: 6884
Reputation: 483
This is the solution that solved the problem for me:
GET test/_search?filter_path=aggregations.count
{
"size": 0,
"aggs": {
"names": {
"terms": {
"script": {
"source": "return doc['name.keyword'].value + ' ' + doc['city.keyword'].value",
"lang": "painless"
},
"field": "name.keyword",
"size": 10,
"min_doc_count": 2
}
},
"count": {
"cardinality": {
"script": "return doc['name.keyword'].value + ' ' + doc['city.keyword'].value"
}
}
}
}
Output:
{
"aggregations" : {
"count" : {
"value" : 2
}
}
}
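As a rough illustration of what the `count` aggregation computes: the cardinality of the concatenated key is the number of distinct name/city pairs. The sketch below reproduces that in plain Python on the sample documents from the question, not on the index that produced the output above. `docs` and `combined_key` are illustrative names. The real `cardinality` aggregation is approximate on large sets, whereas this sketch is exact.

```python
# Sample documents copied from the question.
docs = [
    {"name": "james", "city": "chicago"},
    {"name": "james", "city": "san francisco"},
    {"name": "james", "city": "chicago"},
    {"name": "Mike", "city": "chicago"},
    {"name": "Mike", "city": "texas"},
    {"name": "Mike", "city": "texas"},
    {"name": "Peter", "city": "chicago"},
]


def combined_key(doc):
    # Same concatenation as the Painless script: name, a space, then city.
    return doc["name"] + " " + doc["city"]


# A set keeps each concatenated key once, so its size is the distinct count.
distinct_keys = {combined_key(doc) for doc in docs}
distinct_count = len(distinct_keys)
print(distinct_count)
```

One design caveat: with a plain space as separator, two different pairs could produce the same key if a name or city itself contains spaces, so a separator that cannot occur in the data may be safer.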
Upvotes: 0
Reputation: 2089
You can do a two-level terms aggregation:
{
"size": 0,
"aggs": {
"names": {
"terms": {
"field": "name.keyword",
"size": 10
},
"aggs": {
"citys_by_name": {
"terms": {
"field": "city.keyword",
"size": 10,
"min_doc_count": 2
}
}
}
}
}
}
The response will look like this:
"aggregations" : {
"names" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "james",
"doc_count" : 15,
"citys_by_name" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "chicago",
"doc_count" : 14
}
]
}
},
{
"key" : "Peter",
"doc_count" : 2,
"citys_by_name" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "chicago",
"doc_count" : 2
}
]
}
},
{
"key" : "mike",
"doc_count" : 2,
"citys_by_name" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [ ]
}
}
]
}
}
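To make the effect of `min_doc_count` on the inner level concrete, here is a small Python sketch of the same two-level grouping. It runs on the sample documents from the question rather than on the data behind the response above, and `cities_by_name` is a hypothetical helper name:

```python
from collections import Counter, defaultdict

# Sample documents copied from the question.
docs = [
    {"name": "james", "city": "chicago"},
    {"name": "james", "city": "san francisco"},
    {"name": "james", "city": "chicago"},
    {"name": "Mike", "city": "chicago"},
    {"name": "Mike", "city": "texas"},
    {"name": "Mike", "city": "texas"},
    {"name": "Peter", "city": "chicago"},
]

MIN_DOC_COUNT = 2  # mirrors "min_doc_count": 2 on the inner terms aggregation


def cities_by_name(documents, min_doc_count):
    """Group by name, then count cities and drop those seen too rarely."""
    per_name = defaultdict(Counter)
    for doc in documents:
        per_name[doc["name"]][doc["city"]] += 1
    return {
        name: {city: n for city, n in counts.items() if n >= min_doc_count}
        for name, counts in per_name.items()
    }


nested = cities_by_name(docs, MIN_DOC_COUNT)
print(nested)
```

A name whose cities all fall below the threshold still appears in the outer level with an empty inner result, just like the `mike` bucket in the response above.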
Or you can concatenate fields:
GET test/_search
{
"size": 0,
"aggs": {
"names": {
"terms": {
"script": {
"source": "return doc['name.keyword'].value + ' ' + doc['city.keyword'].value",
"lang": "painless"
},
"field": "name.keyword",
"size": 10,
"min_doc_count": 2
}
}
}
}
The response will look like this:
"aggregations" : {
"names" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "james chicago",
"doc_count" : 14
},
{
"key" : "Peter chicago",
"doc_count" : 2
}
]
}
}
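The concatenation approach can be mimicked in Python as a flat count over the combined key, followed by the `min_doc_count` filter. This is only a sketch on the question's sample documents (the response above comes from different data), and `concatenated_buckets` is an illustrative name:

```python
from collections import Counter

# Sample documents copied from the question.
docs = [
    {"name": "james", "city": "chicago"},
    {"name": "james", "city": "san francisco"},
    {"name": "james", "city": "chicago"},
    {"name": "Mike", "city": "chicago"},
    {"name": "Mike", "city": "texas"},
    {"name": "Mike", "city": "texas"},
    {"name": "Peter", "city": "chicago"},
]


def concatenated_buckets(documents, min_doc_count):
    """Count documents per 'name city' key and keep frequent keys only."""
    counts = Counter(doc["name"] + " " + doc["city"] for doc in documents)
    return {key: n for key, n in counts.items() if n >= min_doc_count}


flat = concatenated_buckets(docs, min_doc_count=2)
print(flat)
```

Compared with the two-level version, this yields a single flat list of buckets, at the cost of having to split the key again if name and city are needed separately.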
If you want more stats on the buckets, use the stats_bucket aggregation:
{
"size": 0,
"aggs": {
"names": {
"terms": {
"script": {
"source": "return doc['name.keyword'].value + ' ' + doc['city.keyword'].value",
"lang": "painless"
},
"field": "name.keyword",
"size": 10,
"min_doc_count": 2
}
},
"names_stats":{
"stats_bucket": {
"buckets_path":"names._count"
}
}
}
}
This results in:
"aggregations" : {
"names" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "james PARIS",
"doc_count" : 15
},
{
"key" : "james chicago",
"doc_count" : 13
},
{
"key" : "samuel PARIS",
"doc_count" : 11
},
{
"key" : "fred PARIS",
"doc_count" : 2
}
]
},
"names_stats" : {
"count" : 4,
"min" : 2.0,
"max" : 15.0,
"avg" : 10.25,
"sum" : 41.0
}
}
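The `names_stats` values follow directly from the four bucket counts shown above. As a quick check, this Python sketch recomputes them from the `doc_count` values; the variable names are illustrative:

```python
# doc_count values of the four buckets in the response above.
doc_counts = [15, 13, 11, 2]

names_stats = {
    "count": len(doc_counts),                  # number of buckets
    "min": float(min(doc_counts)),
    "max": float(max(doc_counts)),
    "avg": sum(doc_counts) / len(doc_counts),  # 41 / 4
    "sum": float(sum(doc_counts)),
}
print(names_stats)
```

Note that `count` here is the number of buckets that passed `min_doc_count`, not the number of documents.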
Upvotes: 5