Reputation: 1860
I have an aggregation in Elasticsearch that returns a response like this:
{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1261,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "clusters": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 1073,
      "buckets": [
        { "key": 813058, "doc_count": 46 },
        { "key": 220217, "doc_count": 29 },
        { "key": 287763, "doc_count": 23 },
        { "key": 527217, "doc_count": 20 },
        { "key": 881778, "doc_count": 15 },
        { "key": 700725, "doc_count": 14 },
        { "key": 757602, "doc_count": 13 },
        { "key": 467496, "doc_count": 10 },
        { "key": 128318, "doc_count": 9 },
        { "key": 317261, "doc_count": 9 }
      ]
    }
  }
}
I want to get one document for every bucket in the aggregation (the top-scoring one or a random one; either works). How do I do that?
The query I am using to get the aggregation is this:
GET myindex/_search
{
  "size": 0,
  "aggs": {
    "clusters": {
      "terms": {
        "field": "myfield",
        "size": 100000
      }
    }
  },
  "query": {
    "bool": {
      "must": [
        {
          "query_string": { "default_field": "field1", "query": "val1" }
        },
        {
          "query_string": { "default_field": "field2", "query": "val2" }
        }
      ]
    }
  }
}
I am trying to implement a cluster-based sentence similarity system, which is why I need this: I pick one sentence from every cluster and check its similarity to a given sentence.
Upvotes: 2
Views: 1868
I was able to solve it by using the top hits aggregation documented here: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-top-hits-aggregation.html
Nesting a top_hits sub-aggregation with size 1 under the terms aggregation returns one document per bucket. Sample query:
GET myindex/_search
{
  "size": 0,
  "aggs": {
    "clusters": {
      "terms": {
        "field": "myfield",
        "size": 100000
      },
      "aggs": {
        "mydoc": {
          "top_hits": {
            "size": 1
          }
        }
      }
    }
  },
  "query": {
    "bool": {
      "must": [
        {
          "query_string": { "default_field": "field1", "query": "val1" }
        },
        {
          "query_string": { "default_field": "field2", "query": "val2" }
        }
      ]
    }
  }
}
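To use the result on the client side, each bucket's document has to be pulled out of the nested top_hits section of the response. The following is a minimal Python sketch of that step, not a definitive implementation. It assumes the response has already been parsed into a dict, that the aggregations are named "clusters" and "mydoc" as in the query above, and that the sentence is stored in a field called "sentence". The helper name one_doc_per_bucket, the field name, the document IDs, and the sample data are all hypothetical.

```python
from typing import Any, Dict


def one_doc_per_bucket(response: Dict[str, Any],
                       terms_agg: str = "clusters",
                       hits_agg: str = "mydoc") -> Dict[Any, Dict[str, Any]]:
    """Map each terms-bucket key to the _source of its first top_hits document."""
    docs: Dict[Any, Dict[str, Any]] = {}
    for bucket in response["aggregations"][terms_agg]["buckets"]:
        # top_hits nests its results under <agg name> -> hits -> hits
        hits = bucket[hits_agg]["hits"]["hits"]
        if hits:  # skip a bucket whose top_hits list is empty
            docs[bucket["key"]] = hits[0]["_source"]
    return docs


# Hypothetical, abbreviated response: the "sentence" field, the IDs and the
# scores are made up for illustration and are not taken from the question.
sample_response = {
    "aggregations": {
        "clusters": {
            "buckets": [
                {
                    "key": 813058,
                    "doc_count": 46,
                    "mydoc": {"hits": {"total": 46, "max_score": 1.2, "hits": [
                        {"_index": "myindex", "_id": "a1", "_score": 1.2,
                         "_source": {"sentence": "first example sentence"}},
                    ]}},
                },
                {
                    "key": 220217,
                    "doc_count": 29,
                    "mydoc": {"hits": {"total": 29, "max_score": 0.9, "hits": [
                        {"_index": "myindex", "_id": "b7", "_score": 0.9,
                         "_source": {"sentence": "second example sentence"}},
                    ]}},
                },
            ]
        }
    }
}

representatives = one_doc_per_bucket(sample_response)
for key, source in representatives.items():
    print(key, source["sentence"])
```

Because top_hits sorts by the score of the main query by default, size 1 yields the highest-scoring document in each bucket, which covers the "top score" option from the question.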
Upvotes: 7