Satarupa Guha

Reputation: 1307

How to get phrase-level sentiment from Stanford Core NLP package

This question may not be entirely on topic for this community, but I hoped that posting here would let me reach the wider computer science community and get help.

I am using the Stanford CoreNLP package, specifically its Sentiment module. I get sentence-level sentiment with the following command:

java -cp stanford-corenlp-3.4.jar:stanford-corenlp-3.4-models.jar:xom.jar:joda-time.jar:jollyday.jar:ejml-0.23.jar -mx2g edu.stanford.nlp.sentiment.SentimentPipeline -stdin < input.txt

But I need phrase-level sentiment, like the online demo shows, and I have not been able to figure out how to get it.

EDIT:

After looking into the source code, I found that adding one more argument to the command above yields a sentiment score for each node of the sentence's parse tree. This gives only a numeric score per node rather than a positive/negative label, but I think translating the score into a binary positive/negative sentiment is fairly trivial. The command is:

java -cp stanford-corenlp-3.4.jar:stanford-corenlp-3.4-models.jar:xom.jar:joda-time.jar:jollyday.jar:ejml-0.23.jar -mx2g edu.stanford.nlp.sentiment.SentimentPipeline -stdin -output PENNTREES < input.txt
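
Translating the per-node scores into labels could look roughly like the Python sketch below. It assumes that each tree is printed in bracketed form with an integer label on every node, like the tree format shown in the answer, and that the labels follow a five-class scheme where 0 and 1 are negative, 2 is neutral, and 3 and 4 are positive. The function names parse_tree, phrase_sentiments and polarity are made up for this illustration and are not part of CoreNLP.

```python
import re


def polarity(label):
    """Collapse an assumed 0-4 class label into negative / neutral / positive."""
    if label < 2:
        return "negative"
    if label > 2:
        return "positive"
    return "neutral"


def parse_tree(text):
    """Parse one bracketed tree, e.g. '(0 (2 I) (0 (1 hate) (2 demo)))'.

    Returns (label, body), where body is a token string for a leaf
    or a list of subtrees for an internal node.
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1
        label = int(tokens[pos])
        pos += 1
        if tokens[pos] != "(":
            # Leaf node: "(label word)"
            word = tokens[pos]
            pos += 2  # skip the word and its closing ")"
            return (label, word)
        children = []
        while tokens[pos] == "(":
            children.append(node())
        pos += 1  # skip the closing ")"
        return (label, children)

    return node()


def leaves(tree):
    """Return the words covered by a node, left to right."""
    _, body = tree
    if isinstance(body, str):
        return [body]
    return [word for child in body for word in leaves(child)]


def phrase_sentiments(tree):
    """Yield (phrase, label, polarity) for every node, top-down."""
    label, body = tree
    yield " ".join(leaves(tree)), label, polarity(label)
    if not isinstance(body, str):
        for child in body:
            yield from phrase_sentiments(child)


# Example using the annotated tree quoted in the answer.
for phrase, label, pol in phrase_sentiments(
    parse_tree("(0 (2 I) (0 (1 hate) (2 demo)))")
):
    print(label, pol, phrase)
```

To apply this to real output, redirect the command's output to a file and call parse_tree on each line that starts with an opening parenthesis. If the trees turn out to be printed across several lines, they would need to be joined first.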

Upvotes: 4

Views: 1940

Answers (1)

saganas

Reputation: 565

You could use BuildBinarizedDataset (stanford-corenlp 3.4) as an example of how to parse a sentence into a PTB tree with sentiment annotations. Currently BuildBinarizedDataset takes input like this:

0   I hate demo
2   I
1   hate
2   demo
0   I hate

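Here the first line is the sentence with its overall sentiment label, and the following lines give labels for individual phrases. This input is meant for building training data, that is, for producing sentiment-annotated PTB trees to train the model, not for predicting the sentiment of separate phrases. For the input above, the resulting tree is: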

(0 (2 I) (0 (1 hate) (2 demo)))

However, if you provide only the sentence, it will produce a tree in which every node carries the overall sentence sentiment value:

(0 (0 I) (0 (0 hate) (0 demo)))
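
If you want to generate that input from your own labels, a small Python helper along these lines might do. It is only a sketch: format_entry is a made-up name, and it assumes that a single space between the label and the text is accepted and that a blank line ends each sentence block.

```python
def format_entry(sentence_label, sentence, phrase_labels=()):
    """Build one input block: the labelled sentence, then labelled phrases.

    phrase_labels is a sequence of (phrase, label) pairs. The trailing blank
    line is assumed to be what separates one sentence block from the next.
    """
    lines = ["%d %s" % (sentence_label, sentence)]
    for phrase, label in phrase_labels:
        lines.append("%d %s" % (label, phrase))
    return "\n".join(lines) + "\n\n"


# Reproduces the example block from this answer.
entry = format_entry(
    0,
    "I hate demo",
    [("I", 2), ("hate", 1), ("demo", 2), ("I hate", 0)],
)
print(entry, end="")
```

Calling format_entry with no phrase labels gives the sentence-only case, where every node ends up with the sentence label.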

You could probably modify the code of BuildBinarizedDataset so that, instead of assigning labels from the predefined values, it evaluates each node with the sentiment annotation pipeline.

I hope this points you in the right direction. If you figure out how to do it, please share.

Upvotes: 4
