Saving output (context) embeddings in word2vec (gensim implementation) as a final model

Question

I have studied the word2vec implementation in gensim. I am aware that the input vectors are stored in syn0 and the output vectors in syn1, or in syn1neg when negative sampling is used.

I know I can access similarity between input and output embeddings like this:

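from gensim.models.keyedvectors import KeyedVectors

# Wrap the output (context) matrix in a KeyedVectors that shares the model's vocabulary
outv = KeyedVectors()
outv.vocab = model.wv.vocab
outv.index2word = model.wv.index2word
outv.syn0 = model.syn1neg  # use model.syn1 instead for hierarchical softmax
# Words whose output vectors are closest to the input vector of 'cousin'
inout_similars = outv.most_similar(positive=[model.wv['cousin']])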

My question is whether it is possible to save the output embeddings (from the syn1 or syn1neg matrix) as the final model, for example so that model.save() writes out the output embeddings. Failing that, where exactly in word2vec.py could I access and modify this? I need it in order to use the output embeddings as input to a classifier. I have done this before with a brute-force approach, so I would like an easier way to get at the output embeddings.

gojomo · Accepted Answer

Your object outv, as an instance of KeyedVectors, has its own save() method (inherited from the SaveLoad superclass defined in gensim/utils.py) and a save_word2vec_format() method. Either one writes the vectors in a form you can reload into Python code later, with KeyedVectors.load() or KeyedVectors.load_word2vec_format() respectively.
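To give a sense of what ends up on disk, here is a minimal, dependency-free sketch of the plain-text word2vec layout, the one save_word2vec_format() writes when binary=False: a header line with the vocabulary size and dimensionality, then one line per word. The helpers save_word2vec_text and load_word2vec_text and the toy vectors are hypothetical, made up for illustration; in the question's setup the words would come from model.wv.index2word and the rows from model.syn1neg.

```python
import os
import tempfile


def save_word2vec_text(path, words, vectors):
    """Write words and their vectors in the plain-text word2vec layout."""
    if len(words) != len(vectors):
        raise ValueError("words and vectors must have the same length")
    dim = len(vectors[0]) if vectors else 0
    with open(path, "w", encoding="utf-8") as f:
        # Header line: vocabulary size, then vector dimensionality.
        f.write(f"{len(words)} {dim}\n")
        for word, vec in zip(words, vectors):
            # repr() of a float round-trips exactly through float().
            values = " ".join(repr(float(x)) for x in vec)
            f.write(f"{word} {values}\n")


def load_word2vec_text(path):
    """Read such a file back into a {word: [floats]} dict."""
    table = {}
    with open(path, encoding="utf-8") as f:
        n_words, dim = (int(x) for x in f.readline().split())
        for line in f:
            parts = line.rstrip("\n").split(" ")
            vec = [float(x) for x in parts[1:]]
            if len(vec) != dim:
                raise ValueError(f"bad vector length for {parts[0]!r}")
            table[parts[0]] = vec
    if len(table) != n_words:
        raise ValueError("header does not match the number of rows")
    return table


# Toy stand-ins (assumptions): replace with model.wv.index2word and
# the rows of model.syn1neg to export real output embeddings.
words = ["cousin", "uncle"]
out_vectors = [[0.1, -0.2, 0.3], [0.0, 0.5, -1.0]]

path = os.path.join(tempfile.mkdtemp(), "output_vectors.txt")
save_word2vec_text(path, words, out_vectors)
restored = load_word2vec_text(path)
```

A file in this layout can be read by any tool that understands the word2vec text format, so a classifier pipeline does not need gensim at all to consume the output embeddings.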
