zhuj9

Reputation: 73

How to differentiate sentences with antonyms using word2vec

Say I have two sentences that are identical except for one word, which is replaced by a word of opposite meaning, e.g. "I like her" vs. "I hate her". I'm using word2vec in my classification project. As far as I know, word2vec is unable to distinguish between antonyms. Is there any way to solve this?

Upvotes: 2

Views: 410

Answers (1)

gojomo

Reputation: 54183

Unfortunately, what we consider 'antonyms' are usually quite similar in word2vec coordinate spaces. That's because such words are quite similar in almost all respects – except for the one contrast they emphasize.

Further, to the extent those contrasts are captured as directions in the word2vec space, they will point in many varied directions. The 'hot'-vs-'cold' contrast will be different from the 'light'-vs-'dark' and the 'small'-vs-'big'.

There might be some analytic technique on sets of word-vectors that helps discover antonymic directions/pairs, but I haven't noticed one discussed, especially not anything that's simple/intuitive or applicable to general word-vector sets. (Once you do know words are opposites, as when consulting prior labeled lexicons or analogy questions, then the directions-between-their-word-vectors can be useful in other analysis, like discovering other words that contrast-in-the-same-way, as when solving analogy problems.)
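As a rough sketch of that parenthetical point, here is the analogy-style arithmetic in plain Python. The 3-d vectors are invented purely for illustration (they are not real word2vec output), and `cosine` and `analogy` are hypothetical helper names. Given a known opposite pair ('hot', 'cold'), the offset between them is applied to 'warm' to look for a word that contrasts in the same way:

```python
import math

# Toy 3-d vectors, made up for illustration only (NOT real word2vec output).
toy_vectors = {
    "hot":  [1.0,  0.5, 0.0],
    "cold": [1.0, -0.5, 0.0],
    "warm": [0.9,  0.3, 0.1],
    "cool": [0.9, -0.3, 0.1],
    "big":  [0.0,  0.2, 1.0],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def analogy(a, b, c, vectors):
    """Return the word nearest to vec(b) - vec(a) + vec(c), excluding a, b, c."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# 'hot' is to 'cold' as 'warm' is to ...?
print(analogy("hot", "cold", "warm", toy_vectors))

# Even in this toy set, the opposites are closer to each other
# than either is to an unrelated word.
print(cosine(toy_vectors["hot"], toy_vectors["cold"]))
print(cosine(toy_vectors["hot"], toy_vectors["big"]))
```

Note this only works because the opposite pair was supplied up front; it does not discover that 'hot' and 'cold' are antonyms in the first place.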

Can you be more specific about your ultimate goal, with more examples of the kinds of input you'll have and what specific results you want the software to report?

The one example you give, "I like her" vs "I hate her", could be more generally seen as a sentiment classification, and word2vec-powered classifiers can do OK (though far from perfect) on such challenges. That is, with enough labeled training data, a classifier with a lot of examples of "positive" and "negative" texts will tend to learn that 'like' (and similar words) are positive and 'hate' (and similar) are negative, and do OK on other variants of positive/negative statements (excepting more complex constructions, like negations, subtle qualifications, understatement, irony, etc.)
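To make that idea concrete, here is a minimal sketch of a classifier over averaged word vectors. Everything in it is an assumption for illustration: the 2-d vectors are invented (real ones would come from a trained word2vec model), the four training sentences are a stand-in for a real labeled corpus, and a simple nearest-centroid rule is swapped in for whatever classifier you'd actually use.

```python
# Toy 2-d word vectors, invented for illustration only.
toy_vectors = {
    "i": [0.1, 0.0], "her": [0.0, 0.1], "him": [0.0, 0.12],
    "like": [0.8, 0.3], "love": [0.9, 0.4], "adore": [0.85, 0.35],
    "hate": [0.8, -0.3], "despise": [0.9, -0.4], "detest": [0.85, -0.35],
}

def sentence_vector(text, vectors):
    """Average the vectors of all known words in the text."""
    words = [w for w in text.lower().split() if w in vectors]
    if not words:
        raise ValueError("no known words in: %r" % text)
    dim = len(vectors[words[0]])
    return [sum(vectors[w][i] for w in words) / len(words) for i in range(dim)]

def train_centroids(labeled_texts, vectors):
    """Compute one mean sentence-vector (centroid) per label."""
    grouped = {}
    for text, label in labeled_texts:
        grouped.setdefault(label, []).append(sentence_vector(text, vectors))
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in grouped.items()
    }

def predict(text, centroids, vectors):
    """Return the label whose centroid is nearest (squared Euclidean distance)."""
    vec = sentence_vector(text, vectors)
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(vec, centroids[label])),
    )

training = [
    ("I like her", "positive"), ("I love him", "positive"),
    ("I hate her", "negative"), ("I despise him", "negative"),
]
centroids = train_centroids(training, toy_vectors)

# Words never seen in training, but close to seen words in vector space.
print(predict("I adore her", centroids, toy_vectors))
print(predict("I detest him", centroids, toy_vectors))
```

The labels, not the word vectors alone, are what separate 'like' from 'hate' here. Simple averaging also discards word order, which is one reason negations and similar constructions remain hard for this kind of model.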

So more info on what exactly you hope to detect/report, and what you've tried and found insufficient, might generate more ideas on how to achieve it.

Upvotes: 1
