Reputation: 63139
All my documents have a uid field with an ID that links the document to a user. There are multiple documents with the same uid.
I want to search over all the documents and return only the highest-scoring document per unique uid.
The query selecting the relevant documents is a simple multi_match query.
Upvotes: 19
Views: 19885
Reputation: 3183
Elasticsearch 5.3 added support for field collapsing. You should be able to do something like:
GET /_search
{
  "query": {
    "multi_match": {
      "query": "this is a test",
      "fields": [ "subject", "message", "uid" ]
    }
  },
  "collapse": {
    "field": "uid"
  },
  "size": 20,
  "from": 100
}
The benefit of using field collapsing instead of a top hits aggregation is that you can use pagination with field collapsing.
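As a rough illustration of the pagination point, here is a minimal Python sketch that builds the same request body for an arbitrary page. The helper name `build_collapse_query` and its `page` / `page_size` parameters are hypothetical and not part of any Elasticsearch client; the function only produces the JSON body, which you would then send with whatever client you use.

```python
from typing import Any, Dict, List


def build_collapse_query(
    text: str, fields: List[str], page: int, page_size: int = 20
) -> Dict[str, Any]:
    """Build a search body that collapses hits on uid and selects one page.

    `page` is zero-based. This is a hypothetical helper for illustration only.
    """
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be > 0")
    return {
        "query": {"multi_match": {"query": text, "fields": fields}},
        # Keep only the best-scoring hit for each distinct uid value.
        "collapse": {"field": "uid"},
        "size": page_size,
        "from": page * page_size,
    }


# Page 5 with 20 results per page gives the same offset as the example above.
body = build_collapse_query("this is a test", ["subject", "message", "uid"], page=5)
```

Collapsing generally expects the field to be a keyword or numeric field with doc values, so this assumes uid is mapped that way.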
Upvotes: 18
Reputation: 52368
You need a top_hits aggregation.
And for your specific case:
{
  "query": {
    "multi_match": {
      ...
    }
  },
  "aggs": {
    "top-uids": {
      "terms": {
        "field": "uid"
      },
      "aggs": {
        "top_uids_hits": {
          "top_hits": {
            "sort": [
              {
                "_score": {
                  "order": "desc"
                }
              }
            ],
            "size": 1
          }
        }
      }
    }
  }
}
The query above runs your multi_match query and aggregates the results by uid. Within each uid bucket, the documents are sorted by _score in descending order and only the top one is returned. Note that a terms aggregation returns only the 10 largest buckets by default, so set its size parameter if you need more uids.
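To show how the result of this aggregation might be consumed, here is a small Python sketch that pulls the single best hit out of each uid bucket. It assumes the response has already been parsed into a dict and uses the aggregation names from the query above; the function name `best_hit_per_uid` and the sample response values are made up for illustration.

```python
from typing import Any, Dict


def best_hit_per_uid(
    response: Dict[str, Any],
    agg_name: str = "top-uids",
    hits_name: str = "top_uids_hits",
) -> Dict[Any, Dict[str, Any]]:
    """Map each uid bucket key to its top-scoring hit (hypothetical helper)."""
    buckets = response.get("aggregations", {}).get(agg_name, {}).get("buckets", [])
    best = {}
    for bucket in buckets:
        hits = bucket[hits_name]["hits"]["hits"]
        if hits:  # top_hits was requested with size 1, so take the first hit
            best[bucket["key"]] = hits[0]
    return best


# Illustrative response fragment; the ids and scores are invented.
sample_response = {
    "aggregations": {
        "top-uids": {
            "buckets": [
                {
                    "key": "user-1",
                    "doc_count": 3,
                    "top_uids_hits": {
                        "hits": {"hits": [{"_id": "a", "_score": 2.5}]}
                    },
                },
                {
                    "key": "user-2",
                    "doc_count": 1,
                    "top_uids_hits": {
                        "hits": {"hits": [{"_id": "b", "_score": 1.1}]}
                    },
                },
            ]
        }
    }
}

result = best_hit_per_uid(sample_response)
```

If you only need the aggregation output, setting the top-level "size" to 0 in the request avoids returning the regular hits as well.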
Upvotes: 23