Sara

Reputation: 57

New approach to indexing in Elasticsearch

I want to define a new indexing approach in Elasticsearch, so I need to modify the TF-IDF method. Where can I find the TF-IDF implementation in Elasticsearch? Which packages in the Elasticsearch source code do I need to change to implement the new approach?

Upvotes: 0

Views: 336

Answers (1)

Val

Reputation: 217304

The TF/IDF similarity algorithm is implemented in Lucene, not in Elasticsearch itself. However, you can configure a different similarity algorithm for Elasticsearch to use via the similarity module. Seven similarities are currently supported, including the classic one, which is TF/IDF:

  • BM25
  • Classic similarity
  • DFR similarity
  • DFI similarity
  • IB similarity
  • LM Dirichlet similarity
  • LM Jelinek Mercer similarity

Each of them has different parameters that you can tune. Maybe it'd be a good idea to test each of them before venturing into creating your own.
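
As a rough illustration of tuning one of the built-in similarities before writing your own, here is a minimal sketch that builds a create-index request body. It assumes the typeless mapping layout used by newer Elasticsearch versions (older versions nest `properties` under a mapping type name). The similarity name `tuned_bm25`, the field name `title`, and the helper `build_index_body` are hypothetical and chosen only for this example.

```python
import json


def build_index_body(field_name, k1=1.2, b=0.75):
    """Return a create-index body that scores `field_name` with a tuned BM25."""
    return {
        "settings": {
            "index": {
                "similarity": {
                    # "tuned_bm25" is a hypothetical name; any identifier works.
                    # k1 and b are the two tunable BM25 parameters.
                    "tuned_bm25": {"type": "BM25", "k1": k1, "b": b}
                }
            }
        },
        "mappings": {
            "properties": {
                # Point the field at the custom similarity by name.
                field_name: {"type": "text", "similarity": "tuned_bm25"}
            }
        },
    }


body = build_index_body("title", k1=1.5, b=0.6)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of the request that creates the index. Swapping `"type"` and its parameters for another similarity from the list above follows the same pattern, though each type takes its own parameter names.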

More info about the available Lucene similarity algorithms: https://lucene.apache.org/core/6_5_0/core/org/apache/lucene/search/similarities/Similarity.html

Upvotes: 1
