Reputation: 469
For Skip-gram word2vec, training samples are obtained as follows:
Sentence: The fox was running across the maple forest
The word fox gives the following pairs for training:
fox-run, fox-across, fox-maple, fox-forest
and so on for every word. CBOW w2v uses the reverse approach:
run-fox, across-fox, maple-fox, forest-fox
or, for the word forest:
fox-forest, run-forest, across-forest, maple-forest
So we get all the pairs. What is the difference between Skip-gram word2vec and CBOW w2v during training with the gensim library, if we do not specify the target word when training in CBOW mode? Are all pairs of words used in both cases, or not?
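For reference, a minimal sketch of how the two modes are selected in gensim, assuming gensim 4.x (where the dimensionality parameter is named vector_size); the other parameter values here are arbitrary:

```python
from gensim.models import Word2Vec

# One tokenized sentence as toy training data.
sentences = [["the", "fox", "was", "running", "across", "the", "maple", "forest"]]

# sg=1 selects Skip-gram.
skipgram_model = Word2Vec(sentences, sg=1, vector_size=100, window=4, min_count=1)

# sg=0 (the default) selects CBOW. In neither case does the user specify
# the target word; gensim slides a window over each sentence and picks
# the targets itself.
cbow_model = Word2Vec(sentences, sg=0, vector_size=100, window=4, min_count=1)
```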
Upvotes: 2
Views: 1910
Reputation: 54153
Only skip-gram uses training pairs of the form (context_word)->(target_word).
In CBOW, the training examples are (average_of_multiple_context_words)->(target_word). So, when the error from a single training example is backpropagated, multiple context-words get the same corrective nudge.
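To illustrate the difference, here is a toy sketch (not gensim's actual code) using a hypothetical random lookup table of word vectors and ignoring windowing details, subsampling, and negative sampling:

```python
import numpy as np

sentence = ["fox", "run", "across", "maple", "forest"]
target = "fox"
context = [w for w in sentence if w != target]

# Hypothetical lookup table: one random vector per word, for illustration only.
vectors = {w: np.random.rand(5) for w in sentence}

# Skip-gram: one (context_word -> target_word) example per context word,
# so each context word receives its own gradient update.
skipgram_examples = [(w, target) for w in context]

# CBOW: one (averaged_context -> target_word) example, so the same
# corrective nudge is shared by all context words in the window.
cbow_input = np.mean([vectors[w] for w in context], axis=0)
cbow_example = (cbow_input, target)
```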
Upvotes: 7