Reputation: 2163
I've chunked a sentence using:
import nltk

grammar = '''
NP:
{<DT>*(<NN.*>|<JJ.*>)*<NN.*>}
NVN:
{<NP><VB.*><NP>}
'''
chunker = nltk.chunk.RegexpParser(grammar)
tree = chunker.parse(tagged)  # tagged: a list of (word, POS tag) pairs
print(tree)
The result looks like:
(S
  (NVN
    (NP The_Pigs/NNS)
    are/VBP
    (NP a/DT Bristol-based/JJ punk/NN rock/NN band/NN))
  that/WDT
  formed/VBN
  in/IN
  1977/CD
  ./.)
But now I'm stuck trying to figure out how to navigate that. I want to be able to find the NVN subtree, and access the left-side noun phrase ("The_Pigs"), the verb ("are") and the right-side noun phrase ("a Bristol-based punk rock band"). How do I do that?
Upvotes: 25
Views: 27972
Reputation: 823
Try this:
# Note: in NLTK 3, Tree.label() replaces the older .node attribute.
for a in tree:
    if isinstance(a, nltk.Tree):
        if a.label() == 'NVN':  # This climbs into your NVN tree
            for b in a:
                if isinstance(b, nltk.Tree) and b.label() == 'NP':
                    print(b.leaves())  # This outputs your "NP"
                else:
                    print(b)  # This outputs your "VB.*"
It outputs this:
[('The_Pigs', 'NNS')]
('are', 'VBP')
[('a', 'DT'), ('Bristol-based', 'JJ'), ('punk', 'NN'), ('rock', 'NN'), ('band', 'NN')]
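Since nltk.Tree is a list subclass, the same pieces can also be reached by plain indexing. This is only a minimal sketch, assuming NLTK 3 and a hand-built copy of the tree from the question (so it runs without a tagger). The hard-coded positions are an assumption that holds for this sentence only; in general the NVN chunk need not be the first child of S.

```python
from nltk.tree import Tree

# Hand-built copy of the chunk tree from the question (assumption: same shape).
tree = Tree('S', [
    Tree('NVN', [
        Tree('NP', [('The_Pigs', 'NNS')]),
        ('are', 'VBP'),
        Tree('NP', [('a', 'DT'), ('Bristol-based', 'JJ'), ('punk', 'NN'),
                    ('rock', 'NN'), ('band', 'NN')]),
    ]),
    ('that', 'WDT'), ('formed', 'VBN'), ('in', 'IN'), ('1977', 'CD'), ('.', '.'),
])

nvn = tree[0]                                  # first child of S is the NVN chunk here
left_np, verb, right_np = nvn[0], nvn[1], nvn[2]

print(left_np.leaves())                        # tagged words of the left noun phrase
print(verb)                                    # the verb is a plain (word, tag) tuple
print(tree[0, 2].leaves())                     # a tuple index is a path: child 0, then child 2
```

Indexing is handy for a quick look, but iterating (as above) or filtering subtrees is safer when the position of the chunk is not known in advance.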
Upvotes: 5
Reputation: 1164
Here's a code sample for generating all the subtrees with the label 'NP':
def filt(x):
    return x.label() == 'NP'

for subtree in t.subtrees(filter=filt):  # Generate all subtrees labelled 'NP'
    print(subtree)
For siblings, you might want to take a look at the methods ParentedTree.left_sibling() and ParentedTree.right_sibling().
For more details, here are some useful links:
http://www.nltk.org/howto/tree.html #some basic usage and examples
http://nbviewer.ipython.org/github/gmonce/nltk_parsing/blob/master/1.%20NLTK%20Syntax%20Trees.ipynb #a notebook that plays with these methods
http://www.nltk.org/_modules/nltk/tree.html #all API with source
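A rough sketch of the sibling lookup, assuming NLTK 3. Plain Tree objects don't keep track of their parents, so the tree is first converted with ParentedTree.convert(). The tree below is a hand-built copy of the one in the question, not the output of a real tagger run.

```python
from nltk.tree import Tree, ParentedTree

# Hand-built copy of the chunk tree from the question (assumption: same shape).
tree = Tree('S', [
    Tree('NVN', [
        Tree('NP', [('The_Pigs', 'NNS')]),
        ('are', 'VBP'),
        Tree('NP', [('a', 'DT'), ('Bristol-based', 'JJ'), ('punk', 'NN'),
                    ('rock', 'NN'), ('band', 'NN')]),
    ]),
    ('that', 'WDT'), ('formed', 'VBN'), ('in', 'IN'), ('1977', 'CD'), ('.', '.'),
])

ptree = ParentedTree.convert(tree)   # same structure, but subtrees know their parent

# Collect the NP chunks; only subtrees (not leaf tuples) have sibling methods.
nps = list(ptree.subtrees(filter=lambda st: st.label() == 'NP'))
left_np, right_np = nps

print(left_np.parent().label())      # label of the node that contains the NP
print(left_np.right_sibling())       # the verb sits to the right of the first NP
print(right_np.left_sibling())       # ...and to the left of the second NP
print(left_np.left_sibling())        # None: nothing precedes the first NP
```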
Upvotes: 7
Reputation: 3156
Try:
import nltk

ROOT = 'ROOT'
tree = ...
def getNodes(parent):
    # Recursively walk the tree, printing every label and every word.
    for node in parent:
        if isinstance(node, nltk.Tree):
            if node.label() == ROOT:
                print("======== Sentence =========")
                print("Sentence:", " ".join(node.leaves()))
            else:
                print("Label:", node.label())
                print("Leaves:", node.leaves())
            getNodes(node)
        else:
            print("Word:", node)
getNodes(tree)
Upvotes: 19
Reputation: 610
You could, of course, write your own depth-first search, but there is an easier (better) way. If you want every subtree rooted at NVN, use Tree's subtrees method with the filter parameter defined.
>>> print(t)
(S
  (NVN
    (NP The_Pigs/NNS)
    are/VBP
    (NP a/DT Bristol-based/JJ punk/NN rock/NN band/NN))
  that/WDT
  formed/VBN
  in/IN
  1977/CD
  ./.)
>>> for i in t.subtrees(filter=lambda x: x.label() == 'NVN'):
...     print(i)
...
(NVN
  (NP The_Pigs/NNS)
  are/VBP
  (NP a/DT Bristol-based/JJ punk/NN rock/NN band/NN))
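To go from the filtered subtree to the three pieces the question asks for, unpacking the NVN node's children should be enough, since the rule {<NP><VB.*><NP>} gives it exactly three. A sketch assuming NLTK 3 (where label() replaces .node); np_text and split_nvn are hypothetical helper names, and the tree is rebuilt by hand so the example runs without a tagger.

```python
from nltk.tree import Tree

# Hand-built copy of the chunk tree from the question (assumption: same shape).
t = Tree('S', [
    Tree('NVN', [
        Tree('NP', [('The_Pigs', 'NNS')]),
        ('are', 'VBP'),
        Tree('NP', [('a', 'DT'), ('Bristol-based', 'JJ'), ('punk', 'NN'),
                    ('rock', 'NN'), ('band', 'NN')]),
    ]),
    ('that', 'WDT'), ('formed', 'VBN'), ('in', 'IN'), ('1977', 'CD'), ('.', '.'),
])

def np_text(np):
    """Join the words of an NP chunk, dropping the POS tags."""
    return ' '.join(word for word, tag in np.leaves())

def split_nvn(nvn):
    """Return (left NP text, verb, right NP text) for one NVN subtree."""
    left, verb, right = nvn          # children are: NP subtree, (word, tag), NP subtree
    return np_text(left), verb[0], np_text(right)

results = [split_nvn(st) for st in t.subtrees(filter=lambda st: st.label() == 'NVN')]
print(results)
# [('The_Pigs', 'are', 'a Bristol-based punk rock band')]
```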
Upvotes: 9