How can I extract the matrices WI and WO from gensim word2vec?

Question

The CBOW word2vec scheme looks like this:

How can I extract the matrices WI and WO from gensim.models.word2vec.Word2Vec? I found only these fields in the gensim w2v model:

gensim.models.word2vec.Word2Vec.trainables.syn1neg

and

gensim.models.word2vec.Word2Vec.wv.vectors

Can I assume that syn1neg is WI, and that WO = vectors - syn1neg?

Why does this code

from gensim.models import Word2Vec

sentences = [['car', 'tree', 'chip2'], ['chip1', 'sugar']]
model = Word2Vec(sentences, min_count=1, size=5)

give a Word2Vec.trainables.syn1neg matrix that contains only zeros?

For a 30 MB dataset, the Word2Vec.trainables.syn1neg matrix also contains only zeros. The log is here:

gensim log

gojomo · Accepted Answer

The w2v_model.wv.vectors is what was formerly called "syn0", and serves as the "projection weights" which essentially map a one-hot word-encoding into N dimensions. In your diagram, that's WI.
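
To illustrate what "projection weights" means here, the following is a minimal NumPy sketch, not gensim code. The matrix `WI` is a random stand-in for `w2v_model.wv.vectors`, and `vocab_size`, `dim`, and `word_index` are made-up values for illustration. It shows that multiplying a one-hot vector by WI is the same as looking up one row of WI.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
vocab_size, dim = 5, 3

# Random stand-in for w2v_model.wv.vectors (one row per vocabulary word).
rng = np.random.default_rng(0)
WI = rng.normal(size=(vocab_size, dim))

# One-hot encoding of the word at index 2.
word_index = 2
one_hot = np.zeros(vocab_size)
one_hot[word_index] = 1.0

# Projecting the one-hot vector through WI selects the matching row.
projected = one_hot @ WI
assert np.allclose(projected, WI[word_index])
```

This is why the rows of the input matrix can be read directly as word vectors: the multiplication never needs to be carried out, only the row lookup.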

The w2v_model.trainables.syn1neg is the hidden-to-output weights for negative-sampling mode, what your diagram labels WO.
