Reputation: 469
The CBOW word2vec scheme looks like this:
How can I extract the matrices WI and WO from gensim.models.word2vec.Word2Vec?
I found only these fields in the gensim w2v model: gensim.models.word2vec.Word2Vec.trainables.syn1neg and gensim.models.word2vec.Word2Vec.wv.vectors.
Can I assume that syn1neg is WI, and that WO = vectors - syn1neg?
Why does this code
from gensim.models import Word2Vec
sentences = [['car', 'tree', 'chip2'], ['chip1', 'sugar']]
model = Word2Vec(sentences, min_count=1, size=5)
produce a Word2Vec.trainables.syn1neg matrix containing only zeros?
For a 30 MB dataset, the Word2Vec.trainables.syn1neg matrix also contains only zeros. The log is here:
Upvotes: 3
Views: 613
Reputation: 54173
The w2v_model.wv.vectors array is what was formerly called "syn0". It serves as the "projection weights", which essentially map a one-hot word encoding into N dimensions. In your diagram, that's WI.
The w2v_model.trainables.syn1neg array holds the hidden-to-output weights for negative-sampling mode, which your diagram labels WO.
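To make the two roles concrete, here is a rough NumPy sketch of how WI and WO would combine in a CBOW forward pass with negative sampling. It does not call gensim: the arrays are stand-ins with the same (vocab_size, N) shape that w2v_model.wv.vectors and w2v_model.trainables.syn1neg have, the function name cbow_neg_sampling_score is made up for illustration, and averaging the context vectors is an assumption (CBOW can also sum them).

```python
import numpy as np


def cbow_neg_sampling_score(WI, WO, context_ids, candidate_id):
    """Score in (0, 1) that candidate_id is the center word for the given context."""
    # Input side: look up the context rows of WI and average them (hidden layer).
    hidden = WI[context_ids].mean(axis=0)
    # Output side: dot the hidden vector with the candidate's row of WO.
    logit = WO[candidate_id] @ hidden
    # Negative sampling scores each candidate with a sigmoid, not a softmax.
    return 1.0 / (1.0 + np.exp(-logit))


rng = np.random.default_rng(0)
vocab_size, dim = 5, 5

# Hypothetical stand-in for w2v_model.wv.vectors (WI): small random values.
WI = rng.normal(scale=0.1, size=(vocab_size, dim))
# Hypothetical stand-in for w2v_model.trainables.syn1neg (WO): all zeros,
# like the matrix described in the question.
WO = np.zeros((vocab_size, dim))

score = cbow_neg_sampling_score(WI, WO, context_ids=[0, 1], candidate_id=2)
print(score)  # 0.5
```

With an all-zero WO every logit is 0, so every candidate scores sigmoid(0) = 0.5 regardless of what WI contains.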
Upvotes: 1