Reputation: 972
I was wondering how we can empirically validate or evaluate the values of b and k1 in the BM25 formula. In other words, what is the most 'scientific' way to evaluate them?
Is there a research paper we can refer to that shows how this type of evaluation is done?
Upvotes: 2
Views: 6479
Reputation: 1351
The optimal values of these BM25 parameters depend heavily on your document collection. Read this: Pluggable Similarity Algorithms | Elasticsearch
A simple way to tune the parameters is to adjust them and then measure the impact on retrieval performance. If the results are not satisfactory, change the parameters and evaluate again. This loop can be automated with metaheuristic algorithms such as genetic algorithms or ant colony optimization (ACO).
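To make the adjust-and-evaluate loop concrete, here is a minimal sketch in plain Python. It is not how Elasticsearch does it internally. It assumes you have a set of queries with relevance judgments, and it uses mean average precision (MAP) as the metric, which is my choice for illustration; nDCG or any other ranking metric would slot in the same way. The toy documents, queries, judgments, and parameter grids are all made up, and an exhaustive grid search stands in for the metaheuristics mentioned above.

```python
import math
from collections import Counter
from itertools import product


def bm25_scores(query, docs, k1, b):
    """Score every tokenized document in `docs` against `query` with BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter(t for d in docs for t in set(d))  # document frequencies
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query:
            if t not in tf:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            s += idf * tf[t] * (k1 + 1) / norm
        scores.append(s)
    return scores


def average_precision(ranking, relevant):
    """Average precision of a ranked list of doc ids given the relevant set."""
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(queries, qrels, docs, k1, b):
    """MAP over all queries for one (k1, b) setting."""
    aps = []
    for qid, q in queries.items():
        scores = bm25_scores(q, docs, k1, b)
        ranking = sorted(range(len(docs)), key=lambda i: -scores[i])
        aps.append(average_precision(ranking, qrels[qid]))
    return sum(aps) / len(aps)


def grid_search(queries, qrels, docs, k1_grid, b_grid):
    """Try every (k1, b) pair and return the best pair with its MAP."""
    results = {
        (k1, b): mean_average_precision(queries, qrels, docs, k1, b)
        for k1, b in product(k1_grid, b_grid)
    }
    best = max(results, key=results.get)
    return best, results[best]


# Hypothetical toy collection and judgments, for illustration only.
DOCS = [
    "bm25 ranking function for search".split(),
    "search engines rank documents search search search by relevance".split(),
    "cooking pasta with tomato sauce".split(),
    "tuning bm25 parameters k1 and b".split(),
]
QUERIES = {"q1": ["bm25", "search"], "q2": ["bm25", "parameters"]}
QRELS = {"q1": {0}, "q2": {3}}  # relevant doc indices per query

K1_GRID = [0.5, 1.2, 2.0]     # assumed candidate values
B_GRID = [0.0, 0.5, 0.75, 1.0]

best_params, best_map = grid_search(QUERIES, QRELS, DOCS, K1_GRID, B_GRID)
print(best_params, round(best_map, 3))
```

To keep the evaluation honest, you would normally tune on one set of queries and report the metric on a separate held-out set (or use cross-validation). Otherwise the chosen k1 and b are likely to overfit the judged queries.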
Some papers are also available:
Upvotes: 4