Reputation: 13
Does the Wikipedia Python library automatically retrieve the k most relevant documents for a given query? How does it retrieve those documents under the hood? Does it use TF-IDF or some other approach?
Upvotes: 0
Views: 72
Reputation: 2599
As you can see from the module's source code, wikipedia queries the Wikipedia API and returns its results. The order of the documents returned is therefore determined by Wikipedia's own CirrusSearch, which is built on Elasticsearch, not by any TF-IDF or other ranking done inside the library. You can find more information in the Wikipedia API documentation.
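To make that concrete, here is a rough sketch of the kind of request involved, written against the MediaWiki search endpoint (action=query, list=search) using only the standard library. The helper names build_search_params, extract_titles and search_titles, as well as the User-Agent string, are made up for illustration and are not part of the wikipedia package; as far as I know, the library's own wikipedia.search(query, results=k) does something along these lines.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def build_search_params(query, k=5):
    """Build parameters for a MediaWiki full-text search (hypothetical helper)."""
    return {
        "action": "query",
        "list": "search",
        "srsearch": query,  # the search string, passed through unchanged
        "srlimit": k,       # how many hits to ask the server for
        "format": "json",
    }


def extract_titles(response):
    """Return page titles in exactly the order the server returned them."""
    return [hit["title"] for hit in response.get("query", {}).get("search", [])]


def search_titles(query, k=5):
    """Send the request; all ranking happens on Wikipedia's side."""
    url = API_URL + "?" + urllib.parse.urlencode(build_search_params(query, k))
    request = urllib.request.Request(
        url, headers={"User-Agent": "search-sketch/0.1 (example)"}  # placeholder value
    )
    with urllib.request.urlopen(request, timeout=10) as reply:
        return extract_titles(json.load(reply))


if __name__ == "__main__":
    print(search_titles("information retrieval", k=3))
```

Note that the client never scores or re-orders anything: it asks for k hits and keeps the order it receives, so the "top k" is whatever the server-side search considers most relevant.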
Upvotes: 1