r1d1

Reputation: 469

How can I extract the matrices WI and WO from gensim word2vec?

The CBOW word2vec scheme looks like this:

[Image: CBOW architecture diagram with weight matrices WI and WO]

How can I extract the matrices WI and WO from gensim.models.word2vec.Word2Vec? I found only these fields in the gensim w2v model:

gensim.models.word2vec.Word2Vec.trainables.syn1neg

and

gensim.models.word2vec.Word2Vec.wv.vectors

Can I assume that syn1neg is WI, and that WO = vectors - syn1neg?

Why does this code

from gensim.models import Word2Vec

sentences = [['car', 'tree', 'chip2'], ['chip1', 'sugar']]
model = Word2Vec(sentences, min_count=1, size=5)

produce a Word2Vec.trainables.syn1neg matrix that contains only zeros?

With a 30 MB dataset, the Word2Vec.trainables.syn1neg matrix also contains only zeros. The log is here:

gensim log

Upvotes: 3

Views: 613

Answers (1)

gojomo

Reputation: 54173

w2v_model.wv.vectors is the array formerly called "syn0". It holds the "projection weights", which essentially map a one-hot word encoding into N dimensions. In your diagram, that's WI.

w2v_model.trainables.syn1neg holds the hidden-to-output weights used in negative-sampling mode, which is what your diagram labels WO.
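
To make the two roles concrete, here is a minimal NumPy sketch. It does not use gensim: WI and WO are random stand-ins with the same (vocabulary size, vector size) layout as wv.vectors and syn1neg, and the sizes, indices, and variable names are made up for illustration. It assumes CBOW averages the context vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 5, 3  # illustrative sizes, not taken from the question

# Random stand-ins for the two gensim arrays; both are (vocab_size, dim).
WI = rng.normal(size=(vocab_size, dim))  # plays the role of wv.vectors (syn0)
WO = rng.normal(size=(vocab_size, dim))  # plays the role of syn1neg

# Multiplying a one-hot vector by WI simply selects one row of WI.
word_index = 2
one_hot = np.zeros(vocab_size)
one_hot[word_index] = 1.0
projected = one_hot @ WI
row_lookup_matches = np.allclose(projected, WI[word_index])

# CBOW hidden layer (assumed here): mean of the context words' WI rows.
context = [0, 1, 3]
hidden = WI[context].mean(axis=0)

# Negative sampling scores a candidate output word with the sigmoid of
# the dot product between its WO row and the hidden vector.
target = 2
score = 1.0 / (1.0 + np.exp(-(WO[target] @ hidden)))

print(row_lookup_matches, hidden.shape, float(score))
```

So WI is only ever read by row lookup on the input side, while WO is read by row on the output side; the two are separate arrays rather than one being derived from the other.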

Upvotes: 1
