Alonzorz

Reputation: 2162

Stanford NLP sentiment training set

Is there a problem with the original (movie reviews) training set provided by Stanford?

Looking at it, it seems that the words "no" and "not" are always marked as negative, while the word "n't" is always marked as neutral. Moreover, words with two meanings always get the same rating regardless of sense. One would expect the word "like" to be positive in a phrase such as "I like you" but neutral in a phrase such as "A is like B".

Does anyone know why this is the case?

Upvotes: 0

Views: 229

Answers (1)

Christopher Manning

Reputation: 9450

"Problem" is a relative term. There's not something really wrong, but you could provide arguments for doing things differently.

tl;dr

Annotation was indeed done under the model that one subtree of words (including the limiting case of a single word) always gets the same rating.

The idea here is the principle of compositionality of language: to work out the meaning of a novel, larger sentence, it's generally accepted that you work out the meanings of the parts and then work out what happens when those parts are combined. The ratings are doing exactly that for the case of sentiment.
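For concreteness, the released treebank stores each sentence as a parenthesized tree in which every node carries its own integer rating. Below is a minimal sketch (assuming the usual 0-4 label scheme, 0 = very negative, 4 = very positive) that reads such a tree and lists each subtree with its label; the tree string itself is an invented toy example, not a line from the corpus:

```python
# Sketch of the per-subtree rating model: every node of the
# parenthesized tree, down to single words, carries one label.

def parse(s, i=0):
    """Parse one "(label children...)" node starting at s[i].
    Returns (node, next_index) where node = (label, phrase, children)."""
    assert s[i] == "("
    i += 1
    j = s.index(" ", i)
    label = int(s[i:j])
    i = j + 1
    children = []
    if s[i] == "(":                       # internal node: recurse
        while s[i] == "(":
            child, i = parse(s, i)
            children.append(child)
            if s[i] == " ":
                i += 1
        phrase = " ".join(c[1] for c in children)
    else:                                 # leaf: the word itself
        j = s.index(")", i)
        phrase = s[i:j]
        i = j
    assert s[i] == ")"
    return (label, phrase, children), i + 1

def subtrees(node):
    """Yield a node and, recursively, all of its descendants."""
    yield node
    for child in node[2]:
        yield from subtrees(child)

tree, _ = parse("(4 (2 I) (4 (4 like) (2 you)))")
for label, phrase, _ in subtrees(tree):
    print(label, phrase)
```

Under this model a given substring such as "like" carries the same rating wherever that exact subtree occurs, which is precisely the consistency the question observes.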

In contrast, it's not quite obvious what you'd be doing if you were assigning sentiment to a substring in context. If the substring were "[a little bit]", what does it mean to evaluate it in a context like "The movie was [a little bit] original" or "The movie was [a little bit] boring"? Are you evaluating the sentiment of "a little bit", or are you just looking at the context and sticking on the substring a rating which really reflects the sentiment of "original" or "boring"?

Nevertheless, one can still raise questions about the approach. For one thing, there is no use of word senses: one substring gets one rating. For another, it could be argued that sentiment is a kind of gestalt, and even though words have meanings and larger phrase meanings are calculated compositionally from them, it doesn't really make sense to say that a word has a sentiment absent its use in a particular context.

That is, "thin" has a clear meaning, and working from that meaning with your world knowledge, it makes sense that a "thin laptop" is a good thing and "thin walls" are a bad thing; but "thin" by itself doesn't seem to have a sentiment. The sentiment arises from whether the object it refers to is deemed good if thin. Hopefully, in such cases, AMT annotators gave "thin" by itself a neutral rating, and only gave positive and negative ratings to phrases like "thin laptop" and "thin walls". In practice, though, their minds could easily have been conjuring up a particular context, and they judged the word relative to that context.

P.S. This question really seems more suited to Linguistics Stack Exchange than Stack Overflow.

Upvotes: 1
