Reputation: 31
I am trying to recursively parse an expression. I followed a few tutorials, and it seems like Forward() is the class that I need. However, something seemingly simple is causing me trouble.
Here is the code I wrote
from pyparsing import *
exp = Forward()
integer = Word(nums)
exp << (integer | (exp + '+' + exp))
input = "1+1"
print exp.parseString(input)
I want it to return ['1','+','1']
but it only returns ['1']
Help is much appreciated.
Upvotes: 3
Views: 1433
Reputation: 4742
I recently came to this problem. Now the left recursion is supported on PyParsing 3.0.0b2 or later. But the feature should be explicitly enabled, and the order of operands of the |
operator should be adjusted like this.
from pyparsing import ParserElement, Forward, Word, nums
ParserElement.enable_left_recursion()
exp = Forward()
integer = Word(nums)
exp << ((exp + '+' + exp) | integer)
input = "1+1"
print(exp.parseString(input))
This will output the following.
['1', '+', '1']
Upvotes: 1
Reputation: 63782
There are several issues you have here. In ascending order of importance:
exp.parseString(input, parseAll=True)
exp + exp
first, ahead of integer), you will find yourself blowing up the maximum recursion depth. That is because this parser has left-recursive definition of exp
. That is, to parse an exp
, it has to parse an exp
, for which it has to parse an exp
, etc. In general, many published BNFs use recursion to describe this kind of repetitive structure, but pyparsing does not do the necessary lookahead/backtrack for that to work. Try exp <<= integer + ZeroOrMore('+' + integer) | '(' + exp + ')'
- this expression is not left-recursive, since you have to get past an opening parenthesis before parsing a nested exp
.EDIT: Sorry, I was a little too quick on my earlier suggestion, here is the proper way to do your recursive expression parsing:
from pyparsing import *
exp = Forward()
LPAR, RPAR = map(Suppress, "()")
integer = Word(nums)
term = integer | Group(LPAR + exp + RPAR)
exp << term + ZeroOrMore('+' + term)
input = "(1+1) + 1"
print(exp.parseString(input))
prints
[['1', '+', '1'], '+', '1']
If you trace through the code, you'll see the recursion: exp
is defined using term
, and term
is defined using exp
. The fourFn.py
example is closest to this style; since writing that, I've added the infixNotation
method to pyparsing, which would allow you to write:
exp = infixNotation(integer, [
('+', 2, opAssoc.LEFT),
])
infixNotation
takes care of the recursive definitions internally, implicitly defines the '(' + exp + ')'
expression, and makes it easy to implement a system of operators with precedence of operations.
Upvotes: 1