Reputation: 501
I want to find specific nodes in the output of the Stanford dependency parser. For example:
Sentence: Microsoft ad says that Macs are too cool for its customers.
Dependencies:
- compound(ad-2, Microsoft-1)
- nsubj(says-3, ad-2)
- root(ROOT-0, says-3)
- mark(cool-8, that-4)
- nsubj(cool-8, Macs-5)
- cop(cool-8, are-6)
- advmod(cool-8, too-7)
- ccomp(says-3, cool-8)
- case(customers-11, for-9)
- nmod:poss(customers-11, its-10)
- nmod:for(cool-8, customers-11)
I'd like to capture the following constructs:
p1={Node with two outgoing edges labeled "nsubj" and "ccomp"}
In the dependency tree above, `says` satisfies this condition, so p1={says}
and
s1={ n1={Node that is connected to p1 by an edge labeled "nsubj"},
Node connected to n1 by an edge labeled "nn" or "quantmod"}
In the dependency tree above, s1={n1=ad, Microsoft}
I don't know how to extract these nodes. I tried the following condition to extract `ad`, but it also matches `Macs`. I have no idea how to extract the other nodes. Any help would be greatly appreciated.
typedDependency.reln().getShortName().equals("nsubj")
Here is my code:
Tree tree = sentence.get(TreeAnnotation.class);
// Get dependency tree
TreebankLanguagePack tlp = new PennTreebankLanguagePack();
GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
GrammaticalStructure gs = gsf.newGrammaticalStructure(tree);
Collection<TypedDependency> td = gs.typedDependenciesCollapsed();
System.out.println(td);
Object[] list = td.toArray();
System.out.println(list.length);
TypedDependency typedDependency;
for (Object object : list) {
    typedDependency = (TypedDependency) object;
    // Print the dependent word and the relation that attaches it to its head
    System.out.println("Dependency Name " + typedDependency.dep().toString() + " :: " + "Node " + typedDependency.reln());
    if (typedDependency.reln().getShortName().equals("nsubj")) {
        ????
    }
}
Upvotes: 2
Views: 279
Reputation: 3355
Each typed dependency connects a dependent to a head (its governor). For the first construct, iterate over the typed dependencies and record those labeled "nsubj" or "ccomp", together with the ids of their heads. The id of the head of a typed dependency is accessed through gov(), not dep() (dep() returns the dependent):
typedDependency.gov().index()
Then check which nsubj and ccomp dependencies point to the same head. In your example, exactly one head qualifies: "says". This is also why filtering on "nsubj" alone returns "Macs" as well: its head is "cool", which has no outgoing "ccomp" edge.
For the second construct, you can also use the ids of the heads in typed dependencies to track the connections.
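To make the bookkeeping concrete, here is a minimal sketch of both constructs. It does not use CoreNLP at all: `Dep` is a hypothetical stand-in for `TypedDependency` that only stores the relation label plus the word and index of the governor and the dependent, and `headsWithAll`, `dependentsOf` and `example` are names I made up for illustration. With the real API you would fill the same fields from `reln().getShortName()`, `gov()` and `dep()`. Note that the parse printed in the question labels the Microsoft/ad edge `compound` rather than `nn`, so the sketch accepts both.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class DependencyPatterns {

    /** Hypothetical stand-in for TypedDependency: reln(govWord-govIndex, depWord-depIndex). */
    static final class Dep {
        final String reln, govWord, depWord;
        final int govIndex, depIndex;

        Dep(String reln, String govWord, int govIndex, String depWord, int depIndex) {
            this.reln = reln;
            this.govWord = govWord;
            this.govIndex = govIndex;
            this.depWord = depWord;
            this.depIndex = depIndex;
        }
    }

    /** Indices of heads that have at least one outgoing edge for every given label. */
    static Set<Integer> headsWithAll(List<Dep> deps, String... labels) {
        Map<Integer, Set<String>> labelsByHead = new HashMap<>();
        for (Dep d : deps) {
            labelsByHead.computeIfAbsent(d.govIndex, k -> new HashSet<>()).add(d.reln);
        }
        Set<Integer> heads = new TreeSet<>();
        for (Map.Entry<Integer, Set<String>> e : labelsByHead.entrySet()) {
            if (e.getValue().containsAll(Arrays.asList(labels))) {
                heads.add(e.getKey());
            }
        }
        return heads;
    }

    /** Edges leaving the given head whose label is one of the given labels. */
    static List<Dep> dependentsOf(List<Dep> deps, int headIndex, String... labels) {
        List<String> wanted = Arrays.asList(labels);
        List<Dep> result = new ArrayList<>();
        for (Dep d : deps) {
            if (d.govIndex == headIndex && wanted.contains(d.reln)) {
                result.add(d);
            }
        }
        return result;
    }

    /** The dependencies from the question, typed in by hand. */
    static List<Dep> example() {
        return Arrays.asList(
            new Dep("compound", "ad", 2, "Microsoft", 1),
            new Dep("nsubj", "says", 3, "ad", 2),
            new Dep("root", "ROOT", 0, "says", 3),
            new Dep("mark", "cool", 8, "that", 4),
            new Dep("nsubj", "cool", 8, "Macs", 5),
            new Dep("cop", "cool", 8, "are", 6),
            new Dep("advmod", "cool", 8, "too", 7),
            new Dep("ccomp", "says", 3, "cool", 8),
            new Dep("case", "customers", 11, "for", 9),
            new Dep("nmod:poss", "customers", 11, "its", 10),
            new Dep("nmod:for", "cool", 8, "customers", 11));
    }

    public static void main(String[] args) {
        List<Dep> deps = example();
        for (int p1 : headsWithAll(deps, "nsubj", "ccomp")) {
            System.out.println("p1 index = " + p1);
            for (Dep n1 : dependentsOf(deps, p1, "nsubj")) {
                System.out.println("  n1 = " + n1.depWord + "-" + n1.depIndex);
                for (Dep m : dependentsOf(deps, n1.depIndex, "nn", "compound", "quantmod")) {
                    System.out.println("    modifier = " + m.depWord + "-" + m.depIndex);
                }
            }
        }
    }
}
```

On the example sentence this selects index 3 (`says`) as p1, `ad` as n1, and `Microsoft` as the modifier of n1. `Macs` is skipped because its head `cool` never governs a `ccomp` edge.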
Upvotes: 0
Reputation: 8739
Have you reviewed the slides on Semgrex?
They are available here:
http://nlp.stanford.edu/software/Semgrex.ppt
Some more info on Semgrex:
http://nlp.stanford.edu/software/tregex.shtml
Upvotes: 1