Reputation: 13
I have studied the word2vec implementation in gensim. I am aware that the input vectors are in syn0 and the output vectors are in syn1, or in syn1neg if negative sampling is used.
I know I can access the similarity between input and output embeddings like this:
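from gensim.models.keyedvectors import KeyedVectors

# Wrap the output matrix in a KeyedVectors that shares the model's vocabulary
outv = KeyedVectors()
outv.vocab = model.wv.vocab
outv.index2word = model.wv.index2word
outv.syn0 = model.syn1neg
# Words whose output embedding is closest to the input vector of 'cousin'
inout_similars = outv.most_similar(positive=[model['cousin']])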
My question is whether it is possible to save the output embeddings (from the syn1 or syn1neg matrix) as the final model, for example so that model.save() writes the output embeddings (or, failing that, where exactly in word2vec.py I could access and modify this). I need this in order to use the output embeddings as input to a classifier. I have done it previously with a brute-force approach, so I would like an easier way to access the output embeddings.
Upvotes: 1
Reputation: 54183
Your object outv, as an instance of KeyedVectors, has its own save() method (inherited from the SaveLoad superclass defined in gensim/utils.py) and a save_word2vec_format() method. Either would save the vectors in a form you could later reload into Python code.
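If you would rather not depend on the KeyedVectors wrapper at all, the plain-text word2vec format is simple enough to write by hand: a header line with the vocabulary size and the vector dimensionality, then one line per word followed by its values. The sketch below is not gensim code. It uses only the standard library, the helper names write_word2vec_text and read_word2vec_text are made up for illustration, and the toy lists stand in for model.wv.index2word and model.syn1neg. A file written this way should also be loadable with KeyedVectors.load_word2vec_format(), though I have not checked that against every gensim version.

```python
import os
import tempfile


def write_word2vec_text(path, words, vectors):
    """Write words and their vectors in the plain-text word2vec format."""
    rows = [[float(x) for x in vec] for vec in vectors]
    if len(words) != len(rows):
        raise ValueError("need exactly one vector per word")
    dim = len(rows[0]) if rows else 0
    with open(path, "w", encoding="utf-8") as fh:
        # Header line: vocabulary size, then vector dimensionality.
        fh.write(f"{len(words)} {dim}\n")
        for word, row in zip(words, rows):
            # repr() keeps enough digits for the floats to round-trip exactly.
            fh.write(word + " " + " ".join(repr(x) for x in row) + "\n")


def read_word2vec_text(path):
    """Read such a file back into a {word: [floats]} dictionary."""
    table = {}
    with open(path, encoding="utf-8") as fh:
        count, dim = (int(n) for n in fh.readline().split())
        for line in fh:
            parts = line.rstrip("\n").split(" ")
            table[parts[0]] = [float(x) for x in parts[1:]]
    if len(table) != count or any(len(v) != dim for v in table.values()):
        raise ValueError("file does not match its header")
    return table


# Toy stand-ins (made-up values) for model.wv.index2word and model.syn1neg.
words = ["cousin", "aunt", "uncle"]
vectors = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.5], [1.0, 2.0, 3.0]]

path = os.path.join(tempfile.mkdtemp(), "output_vectors.txt")
write_word2vec_text(path, words, vectors)
loaded = read_word2vec_text(path)
```

With a trained model, the call would presumably be write_word2vec_text(path, model.wv.index2word, model.syn1neg), since row i of syn1neg belongs to word i of index2word. The resulting dictionary (or the file itself) can then be used to look up features for a classifier.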
Upvotes: 2