Reputation: 27
I am a computer science student working on an NLP project. I have written a program that converts an input sentence into its dependency structure representation, using the following code:
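import java.util.Properties;
import edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.semgraph.SemanticGraph;
import edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations.CollapsedDependenciesAnnotation;
import edu.stanford.nlp.util.CoreMap;

private void nextActionPerformed(java.awt.event.ActionEvent evt) {
    // Configure the pipeline up to and including the parser.
    Properties props = new Properties();
    props.put("annotators", "tokenize, ssplit, pos, lemma, parse");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props, false);
    String text = input.getText();
    Annotation document = pipeline.process(text);
    // Print the collapsed dependency graph of every sentence.
    for (CoreMap sentence : document.get(SentencesAnnotation.class)) {
        SemanticGraph dependencies = sentence.get(CollapsedDependenciesAnnotation.class);
        System.out.println(dependencies);
    }
}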
For the example sentence "A cat is sitting on the table" I get the following output:
-> sitting/VBG (root)
  -> cat/NN (nsubj)
    -> A/DT (det)
  -> is/VBZ (aux)
  -> table/NN (nmod:on)
    -> on/IN (case)
    -> the/DT (det)
Now I want to retrieve the major semantic elements from this dependency representation. For example, from the sentence above I want to retrieve sitting, cat and table. That is, for a general simple sentence, I want to retrieve the root word, the subject and the object. Could anybody please help with example code?
Upvotes: 1
Views: 980
Reputation: 5749
For simple cases, you can define Semgrex patterns over the dependency graph. For instance, to extract subject/verb/object triples you could use the code below:
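import edu.stanford.nlp.ling.IndexedWord;
import edu.stanford.nlp.semgraph.semgrex.SemgrexMatcher;
import edu.stanford.nlp.semgraph.semgrex.SemgrexPattern;
import edu.stanford.nlp.simple.Sentence;

// Root node with a subject child and an object child.
SemgrexPattern pattern = SemgrexPattern.compile("{$}=root >/.subj(pass)?/ {}=subject >/.obj/ {}=object");
SemgrexMatcher matcher = pattern.matcher(new Sentence("A cat is sitting on the table").dependencyGraph());
while (matcher.find()) {
    IndexedWord root = matcher.getNode("root");
    IndexedWord subject = matcher.getNode("subject");
    IndexedWord object = matcher.getNode("object");
    // Prints root(subject, object)
    System.err.println(root.word() + "(" + subject.word() + ", " + object.word() + ")");
}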
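The same selection can also be expressed without Semgrex by walking the edges that leave the root. The following is only a minimal sketch in plain Java: `Edge` and `MainElements` are hypothetical stand-ins invented for illustration and are not CoreNLP classes, and the edge list is typed in by hand from the output shown in the question. As an assumption about what "object" should mean, it falls back to the first `nmod:*` dependent of the root when there is no direct object.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a dependency edge; this is not a CoreNLP class.
final class Edge {
    final String governor;   // head word, or null for the root edge
    final String relation;   // e.g. "nsubj", "dobj", "nmod:on"
    final String dependent;  // word the edge points to

    Edge(String governor, String relation, String dependent) {
        this.governor = governor;
        this.relation = relation;
        this.dependent = dependent;
    }
}

final class MainElements {
    // Returns root, subject and object; a key is absent if no edge fits.
    // Words are matched by string, so repeated words would need token indices.
    static Map<String, String> extract(List<Edge> edges) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Edge e : edges) {
            if (e.relation.equals("root")) {
                result.put("root", e.dependent);
            }
        }
        String root = result.get("root");
        if (root == null) {
            return result;
        }
        String oblique = null; // first nmod:* dependent of the root, if any
        for (Edge e : edges) {
            if (!root.equals(e.governor)) {
                continue; // only look at direct children of the root
            }
            if (e.relation.startsWith("nsubj")) {
                result.putIfAbsent("subject", e.dependent);
            } else if (e.relation.equals("dobj") || e.relation.equals("obj")) {
                result.putIfAbsent("object", e.dependent);
            } else if (e.relation.startsWith("nmod") && oblique == null) {
                oblique = e.dependent;
            }
        }
        if (oblique != null) {
            result.putIfAbsent("object", oblique); // used only without a direct object
        }
        return result;
    }

    public static void main(String[] args) {
        // Edges copied by hand from the dependency output in the question.
        List<Edge> edges = List.of(
            new Edge(null, "root", "sitting"),
            new Edge("sitting", "nsubj", "cat"),
            new Edge("cat", "det", "A"),
            new Edge("sitting", "aux", "is"),
            new Edge("sitting", "nmod:on", "table"),
            new Edge("table", "case", "on"),
            new Edge("table", "det", "the"));
        System.out.println(extract(edges)); // {root=sitting, subject=cat, object=table}
    }
}
```

With a real SemanticGraph you would presumably fill such a list from the graph's edges instead of typing it in; that wiring is left out here.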
Note, though, that even your example does not fit this simple case: there is an nmod:on edge rather than a dobj edge between sitting and table, so the Semgrex pattern above will not match it. As things get more complex, it might be worthwhile to take the Stanford OpenIE output at face value:
new Sentence("A cat is sitting on the table").openieTriples()
.forEach(System.err::println);
This will get you the triple (cat; is sitting on; table), and perhaps it is easier to post-process this into (cat; sitting; table) or whatever your actual downstream application needs.
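As a rough illustration of that post-processing step, the sketch below strips auxiliaries and prepositions from the relation phrase of a triple. It is plain Java and does not touch the OpenIE API; `TripleSimplifier` and both word lists are hypothetical and deliberately tiny, so treat it as a starting point rather than a robust solution.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class TripleSimplifier {
    // Illustrative word lists, not taken from CoreNLP; extend them for real data.
    private static final Set<String> AUXILIARIES = new HashSet<>(Arrays.asList(
        "is", "are", "was", "were", "be", "been", "being", "has", "have", "had"));
    private static final Set<String> PREPOSITIONS = new HashSet<>(Arrays.asList(
        "on", "in", "at", "to", "of", "with", "by", "from", "under", "over"));

    // Reduces a relation phrase such as "is sitting on" to "sitting".
    static String simplifyRelation(String relation) {
        List<String> kept = new ArrayList<>();
        for (String token : relation.trim().split("\\s+")) {
            String lower = token.toLowerCase();
            if (!AUXILIARIES.contains(lower) && !PREPOSITIONS.contains(lower)) {
                kept.add(token);
            }
        }
        // Keep the original phrase if every token was filtered out,
        // for example a purely copular relation like "is".
        return kept.isEmpty() ? relation.trim() : String.join(" ", kept);
    }

    public static void main(String[] args) {
        // Subject and object are passed through unchanged in this sketch.
        System.out.println("(cat; " + simplifyRelation("is sitting on") + "; table)");
        // prints (cat; sitting; table)
    }
}
```

A word-list filter like this will misfire on relations where the preposition carries the meaning, so using the POS tags from the pipeline would likely be a sturdier choice.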
Upvotes: 1