Reputation: 42684
I am using Elasticsearch 7.15 and need to aggregate a field and sort them by order.
My document saved in Elasticsearch looks like:
{
"logGroup" : "/aws/lambda/myLambda1",
...
},
{
"logGroup" : "/aws/lambda/myLambda2",
...
}
I need to find out which logGroup
has the most document. In order to do that, I tried to use aggregate
in Elasticsearch:
GET /my-index/_search?size=0
{
"aggs": {
"types_count": {
"terms": {
"field": "logGroup",
"size": 10000
}
}
}
}
the output of this query looks like:
"aggregations" : {
"types_count" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "aws",
"doc_count" : 26303620
},
{
"key" : "lambda",
"doc_count" : 25554470
},
{
"key" : "myLambda1",
"doc_count" : 25279201
}
...
}
As you can see from above output, it splits the logGroup
value into terms and aggregate based on term not the whole string. Is there a way for me to aggregate them as a whole string?
I expect the output looks like:
"buckets" : [
{
"key" : "/aws/lambda/myLambda1",
"doc_count" : 26303620
},
{
"key" : "/aws/lambda/myLambda2",
"doc_count" : 25554470
},
The logGroup
field in the index mapping is:
"logGroup" : {
"type" : "text",
"fielddata" : true
},
Can I achieve it without updating the index?
Upvotes: 0
Views: 329
Reputation: 217594
In order to get what you expect you need to change your mapping to this:
"logGroup" : {
"type" : "keyword"
},
Failing to do that, your log groups will get analyzed by the standard analyzer which splits the whole string and you'll not be able to aggregate by full log groups.
If you don't want or can't change the mapping and reindex everything, what you can do is the following:
First, add a keyword
sub-field to your mapping, like this:
PUT /my-index/_mapping
{
"properties": {
"logGroup" : {
"type" : "text",
"fields": {
"keyword": {
"type" : "keyword"
}
}
}
}
}
And then run the following so that all existing documents pick up this new field:
POST my-index/_update_by_query?wait_for_completion=false
Finally, you'll be able to achieve what you want with the following query:
GET /my-index/_search
{
"size": 0,
"aggs": {
"types_count": {
"terms": {
"field": "logGroup.keyword",
"size": 10000
}
}
}
}
Upvotes: 1