user2738183

negative-sampling and subsampling

I keep hearing the terms "negative sampling" and "subsampling" used in conjunction with word2vec.

Before I attempt to work with word2vec, I'm trying to go back through the papers that reference word embeddings and start from the beginning. The paper trail has landed me here:

https://gul.gu.se/public/pp/public_courses/course77642/published/1497871737091/resourceId/37659332/content/UploadedResources/lecture10-slides-word2vec_sungmin_VT17.pdf (Google "Efficient Estimation of Word Representations in Vector Space" if you don't trust links.)

which states:

[Image not available: slide with a bulleted list]

(I'm familiar with all of the bullet points except the first.)

The only material I've found on negative sampling and subsampling has been inside articles about word2vec, and that's what I'm trying to avoid.

If anyone could explain these terms or point me in the right direction, it would be greatly appreciated :).

Edit: the subsampling tag itself leads to this definition:

"Subsampling is a resampling procedure akin to the bootstrap in which fewer than all observations are being drawn with replacement (vs. the original sample size used in the textbook bootstrap method). For creating samples out of your existing data, please consider "sampling" tag instead."

A concrete example of this would be great.

Upvotes: 1

Views: 2072

Answers (1)

user2738183

I finally found something on negative sampling. If you studied computer science and are comfortable with graphs (a.k.a. "connect the dots"), this link is very helpful for anyone who wants a concrete example:

https://www.safaribooksonline.com/library/view/mastering-java-for/9781782174271/056ce305-83f2-4efe-993a-b549b7ea3133.xhtml

(or google: "mastering java for data science negative sampling")
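For anyone who can't open the link, here is a minimal sketch of the idea as I understand it in the graph setting: the edges that exist are the positive examples, and the negatives are drawn at random from node pairs that are not connected. This assumes a simple undirected graph given as an edge list. The function name `sample_negative_edges` and the toy graph are my own illustration, not taken from the book.

```python
import random

def sample_negative_edges(nodes, edges, k, seed=0):
    """Draw k distinct node pairs that are NOT edges (hypothetical helper)."""
    rng = random.Random(seed)
    nodes = list(nodes)
    # frozenset makes (a, b) and (b, a) the same undirected edge.
    existing = {frozenset(e) for e in edges}
    # Assumes a simple graph: no self-loops, all edges between listed nodes.
    max_negatives = len(nodes) * (len(nodes) - 1) // 2 - len(existing)
    if k > max_negatives:
        raise ValueError("not enough non-edges to sample from")
    negatives = set()
    while len(negatives) < k:
        u, v = rng.sample(nodes, 2)  # two distinct nodes
        pair = frozenset((u, v))
        if pair not in existing:     # keep only unconnected pairs
            negatives.add(pair)
    return sorted(tuple(sorted(p)) for p in negatives)

# Toy graph: a-b and b-c are the positive examples.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c")]
negatives = sample_negative_edges(nodes, edges, k=3)
```

The negatives can then be labelled 0 and the real edges 1 to train a classifier. Rejection sampling like this is only reasonable when the graph is sparse, which is the usual case.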

As for subsampling, I'll be using it for NLP, so this was the most relevant explanation:

[Image not available]

(taken from https://www.safaribooksonline.com/library/view/python-natural-language/9781787121423/f7035ac3-7624-4b80-b464-64ed8a7f252a.xhtml)
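Since the image isn't reproduced here, a rough sketch of what subsampling usually means in the word2vec context: very frequent words are randomly dropped from the training text. In Mikolov et al., "Distributed Representations of Words and Phrases and their Compositionality", a word w with relative frequency f(w) is discarded with probability 1 - sqrt(t / f(w)), where t is a small threshold. The function name `subsample_frequent_words` and the toy corpus are my own assumptions, and real implementations may differ in the details.

```python
import math
import random
from collections import Counter

def subsample_frequent_words(tokens, t=1e-5, seed=0):
    """Randomly drop frequent tokens (hypothetical helper, paper-style formula)."""
    rng = random.Random(seed)
    counts = Counter(tokens)
    total = len(tokens)
    kept = []
    for w in tokens:
        f = counts[w] / total  # relative frequency of w
        # Discard probability from the paper, clamped at 0 for rare words.
        p_discard = max(0.0, 1.0 - math.sqrt(t / f))
        if rng.random() >= p_discard:
            kept.append(w)
    return kept

# Toy corpus: "the" dominates, "cat" and "sat" are rare.
corpus = ["the"] * 900 + ["cat"] * 50 + ["sat"] * 50
kept = subsample_frequent_words(corpus, t=0.1)
```

The threshold `t=0.1` is chosen only so the effect is visible on a tiny corpus. On real text the paper suggests values around 1e-5. Words with f(w) <= t are never dropped, while most occurrences of "the" are.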

Upvotes: 2
