CyberPlayerOne

Reputation: 3180

Is the skip-gram model in word2vec an expanded version of the N-gram model? skip-gram vs. skip-grams?

The skip-gram model of word2vec uses a shallow neural network to learn word embeddings from (input-word, context-word) pairs. When I read tutorials on the skip-gram model, there was no mention of N-grams at all. However, I came across several online discussions in which people claim that the skip-gram model in word2vec is an expanded version of the N-gram model. I also don't really understand the "k-skip-n-gram" in the following Wikipedia page.

Wikipedia cites a paper from 1992 for "skip-grams", so I guess this is not word2vec's skip-gram model, right? Another paper on these "skip-grams" is https://homepages.inf.ed.ac.uk/ballison/pdf/lrec_skipgrams.pdf. This is very confusing. Why has no one cleared this up?

The Wikipedia source and the online discussion are as follows:

Upvotes: 1

Views: 658

Answers (1)

Denis Gordeev

Reputation: 464

I agree that the naming is somewhat tricky here. You may check a tutorial here:

https://www.kdnuggets.com/2018/04/implementing-deep-learning-methods-feature-engineering-text-data-skip-gram.html

So in word2vec, in its simplest skip-gram variant, we may present the whole corpus as many pairs consisting of the target word and the output word that we want to predict with our neural network. For the sentence "the quick brown fox jumps over the lazy dog" and the target word "brown", a window of 4 context words (2 on each side) gives the following (target_word, word_to_predict) pairs:

(brown, quick)
(brown, the)
(brown, fox)
(brown, jumps)

Then we move to the next word, "fox", and so on. Thus, we use these skip-grams to train our neural network. I hadn't seen "k-skip-n-gram" before, but as far as I understand, what we get here are 4-skip-bigrams.
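To make the pair construction concrete, here is a minimal Python sketch. It is not the actual word2vec implementation, and the helper names word2vec_training_pairs and k_skip_n_grams are just illustrative. The first function builds the (target_word, word_to_predict) pairs for a symmetric window of 2 words on each side, which reproduces exactly the four pairs for "brown" above; the second enumerates k-skip-n-grams under one common definition (the one used in the Guthrie et al. paper linked in the question: an in-order subsequence of n words that skips at most k tokens in total).

```python
from itertools import combinations

def word2vec_training_pairs(tokens, window=2):
    """Build (target_word, word_to_predict) pairs the way the skip-gram
    variant of word2vec samples its training data: for each target word,
    every word within `window` positions on either side is a context word."""
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

def k_skip_n_grams(tokens, k=2, n=2):
    """Enumerate k-skip-n-grams (Guthrie et al. definition, assumed here):
    length-n subsequences that preserve word order and skip at most k
    tokens in total."""
    grams = set()
    for start in range(len(tokens)):
        # the remaining n-1 positions can reach at most k tokens further out
        rest = range(start + 1, min(len(tokens), start + n + k))
        for combo in combinations(rest, n - 1):
            positions = (start,) + combo
            # tokens skipped = span covered minus the n tokens kept
            if positions[-1] - positions[0] + 1 - n <= k:
                grams.add(tuple(tokens[p] for p in positions))
    return grams

sentence = "the quick brown fox jumps over the lazy dog".split()

pairs = word2vec_training_pairs(sentence, window=2)
print([p for p in pairs if p[0] == "brown"])
# [('brown', 'the'), ('brown', 'quick'), ('brown', 'fox'), ('brown', 'jumps')]

print(sorted(k_skip_n_grams(sentence, k=1, n=2)))
# includes ('brown', 'fox') and ('brown', 'jumps'), but not ('brown', 'over')
```

Under this reading, the pairs word2vec trains on for a given target word are the same word pairs a skip-bigram enumeration around that word would produce, which may be where the overlap in naming comes from.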

Upvotes: 2
