Reputation: 2089
I'm looking for a setup that actually returns the top 10% of results of a certain query. After the result we also want to sort the subset.
Is there an easy way to do this?
Can anyone provide a simple example for this. I was thinking scaling the results scores between 0 and 1.0 and basically sepcifiying min_score to 0.9.
I was trying to create function_score queries but those seem a bit complex for a simple requirement such as this one, plus I was not sure how sorting would effect the results, since I want the sort functions work always on the 10% most relevant articles of course.
Thanks, Peter
Upvotes: 1
Views: 93
Reputation: 8572
As you want to slice response in % of overall docs count, you need to know that anyway. And using from / size
params will cut off the required amount at query time.
Assuming this, seems that easiest way to achieve your goal is to make 2 queries:
search_type=count
to get overall document count.{"from": 0, "size": count/10}
with count got from 1st response.Talking about tweaking the scoring. For me, it seems as bad idea, as getting multiple documents with the same score is pretty generic situation. So, cutting dataset by min_score
will probably result in skewed data.
Upvotes: 1