Reputation: 1
I need to create a search algorithm over a set of ~90k documents, with new documents added daily. I want to combine BM25 with dense vectors to build a hybrid ranking.
While reviewing the documentation of BM25 Python implementations, I started to wonder whether a BM25 model that has already been built can be updated with new documents. For efficiency reasons, I would rather not rebuild the corpus and the BM25 model from the entire database every time a new document is uploaded.
I looked through several BM25 Python implementations, but I did not find an answer to my question.
EDIT: The exact question is whether a previously generated BM25 model can be updated with new documents, or whether a new model has to be built from the entire database again?
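To make the question concrete, here is a minimal sketch of the kind of incremental update I have in mind. It is not taken from any existing library: the class `IncrementalBM25` and its methods are hypothetical names, the defaults `k1=1.5` and `b=0.75` are just common choices, the IDF is the smoothed `log(1 + (N - df + 0.5) / (df + 0.5))` variant, and tokenization is assumed to be plain whitespace splitting. The idea is that BM25 only depends on per-document term counts plus a few corpus-wide statistics (document count, document frequencies, total length), so adding a document would only need to update those counters.

```python
import math
from collections import Counter, defaultdict


class IncrementalBM25:
    """Hypothetical BM25 index whose corpus statistics are updated in place."""

    def __init__(self, k1=1.5, b=0.75):
        self.k1 = k1
        self.b = b
        self.doc_term_freqs = []          # one Counter of term counts per document
        self.doc_lengths = []             # token count per document
        self.doc_freq = defaultdict(int)  # term -> number of documents containing it
        self.total_length = 0             # sum of all document lengths

    def add_document(self, tokens):
        """Add one tokenized document and return its id; no full rebuild."""
        tf = Counter(tokens)
        self.doc_term_freqs.append(tf)
        self.doc_lengths.append(len(tokens))
        self.total_length += len(tokens)
        for term in tf:
            self.doc_freq[term] += 1
        return len(self.doc_term_freqs) - 1

    def _idf(self, term):
        # Computed at query time, so it always reflects the current corpus.
        n = len(self.doc_term_freqs)
        df = self.doc_freq.get(term, 0)
        return math.log((n - df + 0.5) / (df + 0.5) + 1.0)

    def score(self, query_tokens, doc_id):
        avgdl = self.total_length / len(self.doc_lengths)
        tf = self.doc_term_freqs[doc_id]
        dl = self.doc_lengths[doc_id]
        total = 0.0
        for term in query_tokens:
            f = tf.get(term, 0)
            if f == 0:
                continue
            norm = f + self.k1 * (1 - self.b + self.b * dl / avgdl)
            total += self._idf(term) * f * (self.k1 + 1) / norm
        return total

    def search(self, query_tokens, top_k=5):
        """Return the top_k (score, doc_id) pairs, best first."""
        scored = [(self.score(query_tokens, i), i)
                  for i in range(len(self.doc_lengths))]
        scored.sort(reverse=True)
        return scored[:top_k]


index = IncrementalBM25()
index.add_document("the cat sat on the mat".split())
index.add_document("dogs chase cats".split())

# A new document arrives later: only the counters are updated.
new_id = index.add_document("a cat chased the dog".split())
print(index.search(["cat"], top_k=2))
```

Because the IDF and the average document length are derived from the counters at query time, scores computed after `add_document` should match those of an index built from scratch on the same documents. The scoring loop here is a brute-force scan over all documents, so for ~90k documents an inverted index (term -> document ids) would presumably be needed on top of this.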
Upvotes: 0
Views: 45