Reputation: 333
In the elasticsearch implementation , I have few simple aggregations on the basis of few fields as shown below -
"aggs" : {
"author" : {
"terms" : { "field" : "author"
, "size": 20,
"order" : { "_term" : "asc" }
}
},
"title" : {
"terms" : { "field" : "title"
, "size": 20
}
},
"contentType" : {
"terms" : { "field" : "docType"
, "size": 20
}
}
}
The aggregations work fine and I get the results accordingly. but the title key field returned (or any other field - multi word) , has single word aggregation and results. I need the full title in the returned result, rather then just a word- which doesn't make much sense. how can I get that.
Current results (just a snippet) -
"title": {
"buckets": [
{
"key": "test",
"doc_count": 1716
},
{
"key": "pptx",
"doc_count": 1247
},
{
"key": "and",
"doc_count": 661
},
{
"key": "for",
"doc_count": 489
},
{
"key": "mobile",
"doc_count": 487
},
{
"key": "docx",
"doc_count": 486
},
{
"key": "pdf",
"doc_count": 450
},
{
"key": "2012",
"doc_count": 397
} ] }
expected results -
"title": {
"buckets": [
{
"key": "test document for stack overflow ",
"doc_count": 1716
},
{
"key": "this is a pptx",
"doc_count": 1247
},
{
"key": "its another document and so on",
"doc_count": 661
},
{
"key": "for",
"doc_count": 489
},
{
"key": "mobile",
"doc_count": 487
},
{
"key": "docx",
"doc_count": 486
},
{
"key": "pdf",
"doc_count": 450
},
{
"key": "2012",
"doc_count": 397
} }
I went through a lot of documentation, it explains different ways to aggregate results, but I couldn't find how to get the full text if a field in key in result , please advise how can I achieve this?
Upvotes: 28
Views: 17639
Reputation: 141
I ran into a similar issue. When I ran the command:
curl -XGET "localhost:9200/logstash*/_mapping?pretty"
response had following in it which was useful:
"host" : {
"type" : "string",
"norms" : {
"enabled" : false
},
"fields" : {
"raw" : {
"type" : "string",
"index" : "not_analyzed",
"ignore_above" : 256
}
}
},...
I realised than that adding .raw should change the output and will get the desired output.
so something like:
"aggs": {
"computes": {
"terms": {
"field": "host.raw",
"size": 0
}
}
}
Did the trick for me.
Newbie to the elasticsearch but I am seeing many field of type string has a "raw" field which can be used within query.
It would be good if some experts can shed a light on my findings. Correct/Partially correct/Wrong ?!
Upvotes: 0
Reputation: 1876
seems the multi_fields specified in the above post is deprecated http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/_multi_fields.html#_multi_fields
Upvotes: 0
Reputation: 5924
You need to have untokenized copies of the terms in the index, in your mapping use multi-fields:
{
"test": {
"mappings": {
"book": {
"properties": {
"author": {
"type": "string",
"fields": {
"untouched": {
"type": "string",
"index": "not_analyzed"
}
}
},
"title": {
"type": "string",
"fields": {
"untouched": {
"type": "string",
"index": "not_analyzed"
}
}
},
"docType": {
"type": "string",
"fields": {
"untouched": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
}
}
In your aggregation query reference the untokenized fields:
"aggs" : {
"author" : {
"terms" : {
"field" : "author.untouched",
"size": 20,
"order" : { "_term" : "asc" }
}
},
"title" : {
"terms" : {
"field" : "title.untouched",
"size": 20
}
},
"contentType" : {
"terms" : {
"field" : "docType.untouched",
"size": 20
}
}
}
Upvotes: 30