Reputation: 41
Hi, I’m using Gensim to find the similarity between documents. To do so, I build a TF-IDF model of the documents and calculate cosine similarity. When I have a new document, I can calculate its similarity to the previous documents using index[tfidf[vec]], but this way the TF-IDF model isn’t updated and the new document's words are not considered in the similarity calculation. Is there any way to update TF-IDF quickly without recalculating the whole matrix, or what is the best solution for my problem?
Upvotes: 2
Views: 383
Reputation: 331
I don't think that's possible. When you add a new document to the corpus, the vocabulary of the TF-IDF model changes, and so do the total document count and the per-term document frequencies. That means the IDF weights, and therefore all of the TF-IDF values, change too, so the whole matrix has to be recalculated. But this link may be helpful for you.
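To make that concrete, here is a minimal sketch in plain Python (not Gensim itself) that computes IDF before and after adding one document. It assumes the weighting idf = log2(N / df), which is one common form. The toy corpus and the helper name `idf_table` are made up for illustration.

```python
import math
from collections import Counter


def idf_table(docs):
    """Return {term: log2(N / df)} for a list of tokenized documents.

    Hypothetical helper for illustration; assumes idf = log2(N / df).
    """
    n_docs = len(docs)
    # Count each term once per document to get document frequencies.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return {term: math.log2(n_docs / df) for term, df in doc_freq.items()}


# Toy tokenized corpus (made-up data).
corpus = [
    ["human", "computer", "interface"],
    ["computer", "survey", "system"],
    ["graph", "trees", "system"],
]
new_doc = ["graph", "minors", "survey"]

before = idf_table(corpus)
after = idf_table(corpus + [new_doc])

# Terms that already existed but now carry a different weight.
changed = sorted(t for t in before if not math.isclose(before[t], after[t]))
# Terms that only exist once the new document is included.
added = sorted(set(after) - set(before))

print("terms whose IDF changed:", changed)
print("terms new to the vocabulary:", added)
```

In this toy case every existing term gets a new weight, because N changes for all of them, and "minors" only exists in the vocabulary after the new document is included. A model built on the old corpus would simply ignore it.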
Upvotes: 0