V Y

Reputation: 685

How to get word vectors from a gensim Doc2Vec?

I trained a gensim.models.doc2vec.Doc2Vec model:

d2v_model = Doc2Vec(sentences, size=100, window=8, min_count=5, workers=4)

and I can get document vectors with:

docvec = d2v_model.docvecs[0]

How can I get word vectors from the trained model?

Upvotes: 4

Views: 9314

Answers (2)

lugq

Reputation: 71

To get all the trained doc vectors, use model.docvecs.doctag_syn0. To get a single doc vector by index, use model.docvecs[i]. If you trained a Word2Vec model, the word vectors are in model.wv.syn0. For more detail, see this GitHub issue: https://github.com/RaRe-Technologies/gensim/issues/1513

Upvotes: 0

gojomo

Reputation: 54173

Doc2Vec inherits from Word2Vec, and thus you can access word vectors the same as in Word2Vec, directly by indexing the model:

wv = d2v_model['apple']
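
Depending on the gensim version, the same lookup may need to go through the model's wv attribute instead (the attribute the other answer uses for model.wv.syn0); as far as I know, newer releases no longer allow indexing the model directly. Here is a minimal sketch of a lookup that tries both. The helper name get_word_vector is hypothetical and not part of gensim.

```python
def get_word_vector(model, word):
    """Look up the vector for `word` on a trained Doc2Vec/Word2Vec-style model.

    Hypothetical helper, not a gensim API. Assumes the model either exposes
    its word vectors as `model.wv` or supports direct indexing (`model[word]`).
    """
    keyed_vectors = getattr(model, "wv", None)
    if keyed_vectors is not None:
        # Word vectors live on the model's `wv` attribute.
        return keyed_vectors[word]
    # Fall back to indexing the model directly, as shown above.
    return model[word]
```

With the model from the question this would be called as get_word_vector(d2v_model, 'apple'). Expect a KeyError for any word that didn't survive min_count=5, since such words never make it into the vocabulary.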

Note, however, that a Doc2Vec training mode like pure DBOW (dm=0) doesn't need or create word vectors. (Pure DBOW still works pretty well and fast for many purposes!) If you do access word vectors from such a model, they'll just be the automatic randomly-initialized vectors, with no meaning.

Only when the Doc2Vec mode itself co-trains word-vectors, as in the DM mode (default dm=1) or when adding optional word-training to DBOW (dm=0, dbow_words=1), are word-vectors and doc-vectors both learned simultaneously.
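
As a quick sanity check before relying on a model's word vectors, the rule above can be written as a small predicate over the two constructor parameters. This is only a sketch: the function name cotrains_word_vectors is hypothetical, and the defaults (dm=1, dbow_words=0) are assumed to mirror the Doc2Vec defaults.

```python
def cotrains_word_vectors(dm=1, dbow_words=0):
    """Return True if a Doc2Vec mode learns word vectors alongside doc vectors.

    Hypothetical helper, not a gensim API. Encodes the rule from the answer:
    DM mode (dm=1) co-trains word vectors, and so does DBOW with optional
    word training (dm=0, dbow_words=1). Pure DBOW (dm=0, dbow_words=0) does not.
    """
    return bool(dm) or bool(dbow_words)
```

So cotrains_word_vectors(dm=0) is False (pure DBOW, word vectors are just random initial values), while cotrains_word_vectors(dm=0, dbow_words=1) and the default DM mode are both True.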

Upvotes: 14
