Reputation: 58
I am confused about how to use Doc2Vec (with Gensim) on the IMDB sentiment classification dataset. I have trained Doc2Vec on my corpus, obtained the document embeddings, and built a Logistic Regression model on them. How do I use this to make predictions for new reviews? sklearn's TF-IDF has a transform method that can be applied to test data after fitting on the training data. What is the equivalent in Gensim's Doc2Vec?
Upvotes: 0
Views: 409
Reputation: 54173
Have you seen the demo notebook, included with the gensim source code through gensim-3.8.1, which applies Doc2Vec to the IMDB dataset?
https://github.com/RaRe-Technologies/gensim/blob/3.8.1/docs/notebooks/doc2vec-IMDB.ipynb
Upvotes: 0
Reputation: 2694
To get a vector for an unseen document, use vector = model.infer_vector(["new", "document"])
Then feed vector into your classifier: preds = clf.predict([vector])
Upvotes: 1