Elliot Gorokhovsky

Reputation: 3752

pyparsing not parsing the whole string

I have the following grammar and test case:

from pyparsing import Word, nums, Forward, Suppress, OneOrMore, Group

#A grammar for a simple class of regular expressions
number = Word(nums)('number')
lparen = Suppress('(')
rparen = Suppress(')')

expression = Forward()('expression')

concatenation = Group(expression + expression)
concatenation.setResultsName('concatenation')

disjunction = Group(lparen + OneOrMore(expression + Suppress('|')) + expression + rparen)
disjunction.setResultsName('disjunction')

kleene = Group(lparen + expression + rparen + '*')
kleene.setResultsName('kleene')

expression << (number | disjunction | kleene | concatenation)

#Test a simple input
tests = """
(8)*((3|2)|2)
""".splitlines()[1:]

for t in tests:
    print t
    print expression.parseString(t)
    print

The result should be

[['8', '*'],[['3', '2'], '2']]

but instead, I only get

[['8', '*']]

How do I get pyparsing to parse the whole string?

Upvotes: 5

Views: 943

Answers (2)

PaulMcG

Reputation: 63709

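Your concatenation expression is not doing what you want, and comes close to being left-recursive (fortunately it is the last term in your expression). As written, expression matches exactly one alternative: once kleene has matched (8)*, the parse is finished, and nothing in the grammar tells it to look for a following term. Your grammar works if you instead do: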

expression << OneOrMore(number | disjunction | kleene)

With this change, I get this result:

[['8', '*'], [['3', '2'], '2']]

EDIT: You can also avoid the precedence problem of << over | (<< binds more tightly than |, which is why the right-hand side needs parentheses) if you use the <<= operator instead:

expression <<= OneOrMore(number | disjunction | kleene)
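For reference, here is a minimal self-contained sketch of the whole grammar with that change applied. It assumes Python 3 and a pyparsing version that supports <<=; the results names and the unused concatenation rule from the question are left out for brevity, which is a simplification for this sketch and not part of the fix itself.

```python
from pyparsing import Word, nums, Forward, Suppress, OneOrMore, Group

# Terminals
number = Word(nums)
lparen = Suppress('(')
rparen = Suppress(')')

# Forward declaration so the rules below can refer to expression recursively
expression = Forward()

# ( expr | expr | ... | expr )
disjunction = Group(lparen + OneOrMore(expression + Suppress('|')) + expression + rparen)

# ( expr )*
kleene = Group(lparen + expression + rparen + '*')

# Concatenation is expressed as repetition instead of expression + expression
expression <<= OneOrMore(number | disjunction | kleene)

# parseAll=True makes the parse fail loudly if any input is left over
result = expression.parseString("(8)*((3|2)|2)", parseAll=True)
print(result.asList())
```

The printed list should match the result quoted above, with the second group now included.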

Upvotes: 3

halloleo

Reputation: 10354

parseString has a parameter parseAll. If you call parseString with parseAll=True, pyparsing raises a ParseException when your grammar does not consume the whole string, and the exception reports where the parse stopped. Go from there!
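A small sketch of what that looks like with the grammar from the question, assuming Python 3. The results names are dropped here to keep it short; the variable names text, partial and error are only for illustration.

```python
from pyparsing import (Word, nums, Forward, Suppress, OneOrMore, Group,
                       ParseException)

# The question's grammar, unchanged apart from dropping the results names
number = Word(nums)
lparen = Suppress('(')
rparen = Suppress(')')

expression = Forward()
concatenation = Group(expression + expression)
disjunction = Group(lparen + OneOrMore(expression + Suppress('|')) + expression + rparen)
kleene = Group(lparen + expression + rparen + '*')
expression << (number | disjunction | kleene | concatenation)

text = "(8)*((3|2)|2)"

# Default behaviour: the parse succeeds on a prefix and silently ignores the rest
partial = expression.parseString(text)
print(partial.asList())

# With parseAll=True, leftover input is reported as an error instead
error = None
try:
    expression.parseString(text, parseAll=True)
except ParseException as exc:
    error = exc
    print("parse stopped early:", exc)
```

The exception message points at the position where parsing stopped, which tells you which part of the grammar to look at next.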

Upvotes: 6
