Sentiment-ranked nodes in dependency parse with Stanford CoreNLP?

I'd like to perform a dependency parse on a group of sentences and look at the sentiment ratings of individual nodes, as in the Stanford Sentiment Treebank (http://nlp.stanford.edu/sentiment/treebank.html).

I'm new to the CoreNLP API, and after fiddling around I still have no idea how I'd go about getting a dependency parse with ranked nodes. Is this even possible with CoreNLP, and if so, does anyone have experience doing it?

Upvotes: 3

Answers (1)

Nick Zafiridis

Reputation: 155

I modified the code of the included StanfordCoreNlpDemo.java file to suit our sentiment needs:

Imports:

import java.io.*;
import java.util.*;

import edu.stanford.nlp.io.*;
import edu.stanford.nlp.ling.*;
import edu.stanford.nlp.neural.rnn.RNNCoreAnnotations;
import edu.stanford.nlp.neural.rnn.RNNCoreAnnotations.PredictedClass;
import edu.stanford.nlp.pipeline.*;
import edu.stanford.nlp.semgraph.SemanticGraph;
import edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations;
import edu.stanford.nlp.sentiment.SentimentCoreAnnotations;
import edu.stanford.nlp.trees.*;
import edu.stanford.nlp.util.*;

Initializing the pipeline. The annotators include lemma and sentiment:

public class StanfordCoreNlpDemo {

  public static void main(String[] args) throws IOException {
    PrintWriter out;
    if (args.length > 1) {
      out = new PrintWriter(args[1]);
    } else {
      out = new PrintWriter(System.out);
    }
    PrintWriter xmlOut = null;
    if (args.length > 2) {
      xmlOut = new PrintWriter(args[2]);
    }
    Properties props = new Properties();
    props.put("annotators", "tokenize, ssplit, pos, lemma, parse, sentiment");
    props.setProperty("tokenize.options","normalizeCurrency=false");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

Adding the text. These 3 sentences are taken from the live demo of the site you linked. I print the top level annotation's keys as well, to see what you can access from it:

Annotation annotation;
    if (args.length > 0) {
      annotation = new Annotation(IOUtils.slurpFileNoExceptions(args[0]));
    } else {
      annotation = new Annotation("This movie doesn't care about cleverness, wit or any other kind of intelligent humor. Those who find ugly meanings in beautiful things are corrupt without being charming. There are slow and repetitive parts, but it has just enough spice to keep it interesting.");
    }

    pipeline.annotate(annotation);
    pipeline.prettyPrint(annotation, out);
    if (xmlOut != null) {
      pipeline.xmlPrint(annotation, xmlOut);
    }

    // An Annotation is a Map, so the various analyses can be fetched individually by key.
    out.println();
    // toString() on an Annotation just prints its text;
    // keySet() shows which analyses are available at the top level.
    out.println("The top level annotation's keys: ");
    out.println(annotation.keySet());

For the first sentence, I print its keys and its sentiment. Then I iterate over all the nodes of its sentiment tree. For each node I print the leaves of its subtree (the part of the sentence that the node covers), the name of the node, its sentiment, its node vector (I don't know what that is) and its predictions.

Sentiment is an integer ranging from 0 to 4: 0 is very negative, 1 negative, 2 neutral, 3 positive and 4 very positive. Predictions is a vector of 5 values, one per class, each giving the probability that the node belongs to that class. The first value is for the very negative class, and so on. The class with the highest probability is the node's sentiment.

Not all nodes of the annotated tree have a sentiment. It seems that each word of the sentence has two nodes in the tree. You would expect words to be leaves, but each word node has a single child, named after the same word, whose label lacks the prediction annotation in its keys. That is why I check for the prediction annotation before I call the function that fetches it. A shorter alternative would be to catch and ignore the NullPointerException that is thrown otherwise, but I made the check explicit so that readers of this answer can see that no sentiment information is missing.
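The class numbering described above can be illustrated with a small stand-alone sketch. It uses a plain double[] as a stand-in for the predictions vector, and the names SentimentLabels, LABELS, argMax and labelFor are made up for illustration; none of them is part of CoreNLP, and the prediction values are invented.

```java
class SentimentLabels {
    // Assumed label names for classes 0..4, following the numbering in this answer.
    static final String[] LABELS = {
        "Very negative", "Negative", "Neutral", "Positive", "Very positive"
    };

    // Returns the index of the largest value, i.e. the most likely class.
    static int argMax(double[] predictions) {
        if (predictions == null || predictions.length == 0) {
            throw new IllegalArgumentException("predictions must not be empty");
        }
        int best = 0;
        for (int i = 1; i < predictions.length; i++) {
            if (predictions[i] > predictions[best]) {
                best = i;
            }
        }
        return best;
    }

    // Maps a class index (0..4) to a readable label.
    static String labelFor(int predictedClass) {
        return LABELS[predictedClass];
    }

    public static void main(String[] args) {
        // Invented probabilities for one node, ordered from class 0 to class 4.
        double[] predictions = {0.05, 0.10, 0.15, 0.50, 0.20};
        int predictedClass = argMax(predictions);
        System.out.println(predictedClass + " -> " + labelFor(predictedClass));
    }
}
```

With real CoreNLP output, the five values would first have to be copied out of the matrix returned by RNNCoreAnnotations.getPredictions; that step is left out here.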

List<CoreMap> sentences = annotation.get(CoreAnnotations.SentencesAnnotation.class);
if (sentences != null && sentences.size() > 0) {
  ArrayCoreMap sentence = (ArrayCoreMap) sentences.get(0);


  out.println("Sentence's keys: ");
  out.println(sentence.keySet());

  Tree tree2 = sentence.get(SentimentCoreAnnotations.AnnotatedTree.class);
  out.println("Sentiment class name:");
  out.println(sentence.get(SentimentCoreAnnotations.ClassName.class));

  Iterator<Tree> it = tree2.iterator();
  while(it.hasNext()){
      Tree t = it.next();
      out.println(t.yield());
      out.println("nodestring:");
      out.println(t.nodeString());
      if(((CoreLabel) t.label()).containsKey(PredictedClass.class)){
          out.println("Predicted Class: "+RNNCoreAnnotations.getPredictedClass(t));
      }
      out.println(RNNCoreAnnotations.getNodeVector(t));
      out.println(RNNCoreAnnotations.getPredictions(t));
  }

Lastly, some more output. The dependencies are printed. They could also be obtained through accessors of the parse tree (tree or tree2):

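      out.println("The first sentence is:");
      // Print the sentence text that the heading above announces.
      out.println(sentence.get(CoreAnnotations.TextAnnotation.class));
      out.println();
      Tree tree = sentence.get(TreeCoreAnnotations.TreeAnnotation.class);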
      out.println("The first sentence tokens are:");
      for (CoreMap token : sentence.get(CoreAnnotations.TokensAnnotation.class)) {
        ArrayCoreMap aToken = (ArrayCoreMap) token;
        out.println(aToken.keySet());
        out.println(token.get(CoreAnnotations.LemmaAnnotation.class));
      }
      out.println("The first sentence parse tree is:");
      tree.pennPrint(out);
      tree2.pennPrint(out);
      out.println("The first sentence basic dependencies are:"); 
      out.println(sentence.get(SemanticGraphCoreAnnotations.BasicDependenciesAnnotation.class).toString(SemanticGraph.OutputFormat.LIST));
      out.println("The first sentence collapsed, CC-processed dependencies are:");
      SemanticGraph graph = sentence.get(SemanticGraphCoreAnnotations.CollapsedCCProcessedDependenciesAnnotation.class);
      out.println(graph.toString(SemanticGraph.OutputFormat.LIST));
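    }
    // Flush buffered output; a PrintWriter wrapping System.out does not auto-flush.
    out.flush();
    if (xmlOut != null) {
      xmlOut.flush();
    }
  }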

}
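
To make the "two nodes per word" observation easier to see, here is a minimal stand-alone model of such a tree. ToyNode and ScoredNodes are hypothetical classes written for this sketch and are not CoreNLP's Tree; the labels and sentiment values are invented. The walk collects the covered words and the sentiment only for nodes that carry a score, and skips the word leaves that have none.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a sentiment tree node (not CoreNLP's Tree class).
class ToyNode {
    final String label;
    final Integer sentiment; // null when the node carries no prediction (word leaves)
    final List<ToyNode> children = new ArrayList<>();

    ToyNode(String label, Integer sentiment, ToyNode... kids) {
        this.label = label;
        this.sentiment = sentiment;
        for (ToyNode kid : kids) {
            children.add(kid);
        }
    }

    // Words covered by this subtree, left to right.
    List<String> words() {
        List<String> out = new ArrayList<>();
        if (children.isEmpty()) {
            out.add(label);
        } else {
            for (ToyNode child : children) {
                out.addAll(child.words());
            }
        }
        return out;
    }
}

class ScoredNodes {
    // Pre-order walk that records "covered words => sentiment" for scored nodes only.
    static List<String> collect(ToyNode root) {
        List<String> out = new ArrayList<>();
        walk(root, out);
        return out;
    }

    private static void walk(ToyNode node, List<String> out) {
        if (node.sentiment != null) {
            out.add(String.join(" ", node.words()) + " => " + node.sentiment);
        }
        for (ToyNode child : node.children) {
            walk(child, out);
        }
    }

    // Invented two-word example: each word is a scored node with one unscored leaf child.
    static ToyNode sample() {
        return new ToyNode("ADJP", 3,
            new ToyNode("RB", 2, new ToyNode("quite", null)),
            new ToyNode("JJ", 3, new ToyNode("charming", null)));
    }

    public static void main(String[] args) {
        for (String line : collect(sample())) {
            System.out.println(line);
        }
    }
}
```

In the real code the same effect could probably be had by skipping nodes for which t.isLeaf() is true, instead of checking the label's keys.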

Upvotes: 6
