Reputation: 1582
I am asked to create two
different parse tree for the following sentence:
foo while bar and baz
Based on these two constructions:
S-> S while S
S-> S and S
The two different trees I have are the following:
Tree A)
S
/ | \
P U S
| /|\
W P U P
| |
W W
Here is the code for A:
import nltk
groucho_grammar = nltk.CFG.fromstring ("""
S -> P U S | P U P
P -> W
U -> 'while' | 'and'
W -> 'foo'|'bar'|'baz'
""")
print(groucho_grammar)
sentence = "foo while bar and baz"
rd_parser = nltk.RecursiveDescentParser(groucho_grammar)
for tree in rd_parser.parse(sentence.split()):
print(tree)
And the result for A:
(S (P (W foo)) (U while) (S (P (W bar)) (U and) (P (W baz))))
Tree B)
S
/ | \
S U P
/ | \ \
P U P W
| |
W W
Now for part B, I just modified the grammar to the following:
groucho_grammar = nltk.CFG.fromstring ("""
S -> S U P | P U P
P -> W
U -> 'while' | 'and'
W -> 'foo'|'bar'|'baz'
""")
But I am getting infinite recursion error:
if isinstance(index, (int, slice)):
RuntimeError: maximum recursion depth exceeded in __instancecheck__
Any help would be appreciated.
Thanks.
Upvotes: 1
Views: 554
Reputation: 1305
Your problem is this rule: S -> S U P | P U P
By allowing S to begin with an instance of S, you allow this infinite recursion:
S -> S U P
S -> (S U P) U P
S -> ((S U P) U P) U P
S -> (((S U P) U P) U P) U P
This is called left recursion, and it is caused by a symbol expanding to itself, in this case S expanding to S.
From the NLTK book, chapter 8:
Recursive descent parsing has three key shortcomings. First, left-recursive productions like NP -> NP PP send it into an infinite loop.
A solution
Luckily, you can simply change the parser you use to one that does not share the left-recursive Achilles heel. Simple change this:
rd_parser = nltk.RecursiveDescentParser(groucho_grammar)
to this:
rd_parser = nltk.parse.chart.BottomUpLeftCornerChartParser(groucho_grammar)
This way you make use of the left-recursive-resistant BottomUpLeftCornerChartParser
Further reading
The left-recursive problem is well-known in automata theory. There are ways to make your grammar non-recursive, as explained in these links:
Upvotes: 2