Reputation: 4732
How can I write an Elasticsearch terms aggregation query that takes into account the entire field value, rather than individual tokens? For example, I would like to aggregate by city name, but the following returns new, york, san, and francisco as individual buckets, not new york and san francisco as expected.
curl -XPOST "http://localhost:9200/cities/_search" -d'
{
  "size": 0,
  "aggs": {
    "cities": {
      "terms": {
        "field": "city",
        "min_doc_count": 10
      }
    }
  }
}'
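The behavior can be inspected with the _analyze API: analyzed string fields go through the standard analyzer by default, which lowercases the value and splits it on word boundaries, so "New York" is indexed as the two terms new and york, and the terms aggregation buckets on those terms. A quick check (request shape for Elasticsearch 5.x+; older versions pass analyzer and text as query parameters):

```
GET /_analyze
{
  "analyzer": "standard",
  "text": "New York"
}
```

The response lists the tokens new and york, matching the unexpected buckets above.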
Upvotes: 12
Views: 7854
Reputation: 2318
Update 2018-02-11
We can now use the .keyword suffix on the field being grouped by, according to this:
GET /bank/_search
{
"size": 0,
"aggs": {
"group_by_state": {
"terms": {
"field": "state.keyword"
}
}
}
}
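Applied to the question's index, the same approach would look like this (a sketch; it assumes city was dynamically mapped, so a city.keyword sub-field exists):

```
GET /cities/_search
{
  "size": 0,
  "aggs": {
    "cities": {
      "terms": {
        "field": "city.keyword",
        "min_doc_count": 10
      }
    }
  }
}
```

This works because, since Elasticsearch 5.x, dynamic mapping indexes strings both as an analyzed text field and as a keyword sub-field that stores the value untouched.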
Upvotes: 6
Reputation: 5443
This elastic doc suggests fixing this in the mapping (as the accepted answer does): either make the field not_analyzed, or add a raw sub-field that is not_analyzed and use that in aggregations.
There is no other way around it: aggregations operate on the inverted index, and if the field is analyzed, the inverted index contains only the tokens, not the original field values.
Upvotes: 1
Reputation: 4733
You should fix this in your mapping: add a not_analyzed field. You can make it a multi-field if you also need the analyzed version.
"city": {
  "type": "string",
  "fields": {
    "raw": {
      "type": "string",
      "index": "not_analyzed"
    }
  }
}
Now create your aggregation on city.raw.
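For example, reusing the question's query, the aggregation would simply target the raw sub-field (a sketch; it assumes the mapping above has been applied and the documents reindexed):

```
curl -XPOST "http://localhost:9200/cities/_search" -d'
{
  "size": 0,
  "aggs": {
    "cities": {
      "terms": {
        "field": "city.raw",
        "min_doc_count": 10
      }
    }
  }
}'
```

Because city.raw is not_analyzed, the buckets then contain whole field values such as new york and san francisco rather than individual tokens.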
Upvotes: 16