dataguy

Reputation: 58

Sentiment Classification using Doc2Vec

I am confused about how to use Doc2Vec (via Gensim) on the IMDB sentiment classification dataset. I trained Doc2Vec on my corpus, obtained the embeddings, and built a Logistic Regression model on them. How do I use it to make predictions for new reviews? sklearn's TF-IDF vectorizer has a transform method that can be applied to test data after fitting on the training data; what is the equivalent in Gensim Doc2Vec?

Upvotes: 0

Views: 409

Answers (2)

gojomo

Reputation: 54173

Have you seen the demo notebook, included with the gensim source code through gensim-3.8.1, which applies Doc2Vec to the IMDB dataset?

https://github.com/RaRe-Technologies/gensim/blob/3.8.1/docs/notebooks/doc2vec-IMDB.ipynb

Upvotes: 0

chefhose

Reputation: 2694

To get a vector for an unseen document, use vector = model.infer_vector(["new", "document"]). Then feed vector into your classifier: preds = clf.predict([vector]).

Upvotes: 1
