Reputation: 41
I am trying to get a flat, single-line string from a tree structure like the one given below.
I want the whole tree as one string, like this, without getting a "Bad tree detected" error:
( (S (NP-SBJ (NP (DT The) (JJ high) (JJ seven-day) )(PP (IN of) (NP (DT the) (CD 400) (NNS money) )))(VP (VBD was) (NP-PRD (CD 8.12) (NN %) )(, ,) (ADVP (RB down) (PP (IN from) (NP (CD 8.14) (NN %) ))))(. .) ))
Upvotes: 1
Views: 7932
Reputation: 563
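NLTK can do this directly:
flat_tree = tree._pformat_flat("", "()", False)
The three arguments are nodesep, parens, and quotes. Both tree.pprint() and str(tree) call this method internally, but add extra logic to split the output across multiple lines when it does not fit on one. Note that the leading underscore marks _pformat_flat as a private method, so it may change between NLTK versions.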
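A minimal sketch of the same idea through the public API, assuming NLTK 3, where pformat() takes a margin argument and returns the one-line form whenever it fits within that margin. The sample tree and the choice of an infinite margin are illustrative assumptions, not part of the original answer:

```python
from nltk.tree import Tree

# Hypothetical sample tree, built from a bracketed string (no corpus download needed).
tree = Tree.fromstring("(S (NP (DT The) (JJ high)) (VP (VBD was) (NP-PRD (CD 8.12) (NN %))) (. .))")

# pformat() first builds the flat form and returns it if it fits within `margin`;
# an infinite margin therefore always yields the single-line string.
flat_tree = tree.pformat(margin=float("inf"))
print(flat_tree)
```

This avoids relying on a private method, at the cost of depending on how pformat() treats its margin.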
Upvotes: 1
Reputation: 106
You can convert the tree to a string with str(), then split it on whitespace and join the pieces with single spaces, as follows:
parse_string = ' '.join(str(tree).split())
print(parse_string)
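To see why split-and-join works, here is a small sketch that needs no NLTK at all. The multi-line string below is a made-up stand-in for what str(tree) might return for a tree too wide for one line:

```python
# Hypothetical pretty-printed tree text, standing in for str(tree).
pretty = """(S
  (NP (DT The) (JJ high))
  (VP (VBD was)
    (NP-PRD (CD 8.12) (NN %)))
  (. .))"""

# split() with no arguments splits on any run of whitespace (spaces, tabs,
# newlines), so joining the pieces with one space collapses the layout.
flat = " ".join(pretty.split())
print(flat)
# (S (NP (DT The) (JJ high)) (VP (VBD was) (NP-PRD (CD 8.12) (NN %))) (. .))
```

One caveat: this also collapses whitespace inside leaves, which is harmless for treebank-style trees where every leaf is a single token.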
Upvotes: 5
Reputation: 41
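NLTK's Tree class provides functions for tree manipulation and node extraction. The loop below rebuilds each tree from its string form and extracts its grammar productions:
from nltk.tree import Tree
for tr in trees:
    tr1 = str(tr)              # bracketed string form of the tree
    s1 = Tree.fromstring(tr1)  # parse the string back into a Tree
    s2 = s1.productions()      # list of grammar productions in the tree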
Upvotes: 3
Reputation: 170
The documentation describes a pprint() method that renders the tree as a bracketed string, which can then be flattened into one line.
Parsing this sentence:
string = "My name is Ross and I am cool. What's going on world? I'm looking for friends."
And then calling pprint() yields the following:
u"(NP+SBAR+S\n (S\n (NP (PRP$ my) (NN name))\n (VP\n (VBZ is)\n (NP (NNP Ross) (CC and) (PRP I) (JJ am) (NN cool.))\n (SBAR\n (WHNP (WP What))\n (S+VP (VBZ 's) (VBG going) (NP (IN on) (NN world)))))\n (. ?))\n (S\n (NP (PRP I))\n (VP (VBP 'm) (VBG looking) (PP (IN for) (NP (NNS friends))))\n (. .)))"
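From this point, if you wish to remove the newlines and indentation, you can use split() and join(). (In NLTK 3, pprint() prints the tree and returns None, so there you would call tree.pformat() to get the string instead.)
splitted = tree.pprint().split()
flat_tree = ' '.join(splitted)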
Executing that yields this for me:
u"(NP+SBAR+S (S (NP (PRP$ my) (NN name)) (VP (VBZ is) (NP (NNP Ross) (CC and) (PRP I) (JJ am) (NN cool.)) (SBAR (WHNP (WP What)) (S+VP (VBZ 's) (VBG going) (NP (IN on) (NN world))))) (. ?)) (S (NP (PRP I)) (VP (VBP 'm) (VBG looking) (PP (IN for) (NP (NNS friends)))) (. .)))"
Upvotes: 2