shuttle87

Reputation: 15934

Matching parenthesis with OneOrMore

In an application I'm working on, we have a DSL in which characters can optionally be grouped together. Parentheses determine the groups. For example, good input:

123
12(34)
1(234)

Bad input:

12(34

Essentially, I want any input with mismatched parentheses to fail to parse, since for my purposes this should be a syntax error. I made this MVCE to show the issue I'm having with my pyparsing code:

import pyparsing as pp

def Syntax():
    lpar = pp.Literal('(').suppress()
    rpar = pp.Literal(')').suppress()
    rank = pp.Word('12345678', exact=1) #card ranking
    ranks = pp.OneOrMore(rank)
    rank_grouping = pp.Group(lpar + ranks + rpar)
    atom = ranks | rank_grouping
    return pp.OneOrMore(atom)

mvce_parser = Syntax()
try:
    mvce_parser.parseString("(76)(54")
except pp.ParseException:
    print("Exception1 was thrown!")
else:
    print("Exception1 not thrown :(")

try:
    mvce_parser.parseString("(76(54)")
except pp.ParseException:
    print("Exception2 was thrown!")
else:
    print("Exception2 not thrown :(")

Output:

$ python test.py 
Exception1 not thrown :(
Exception2 was thrown!

The issue is that the first example string, (76)(54, parses and returns [['7','6']] instead of throwing the ParseException that I want. The second one, however, fails as expected.

I suspect this is a result of OneOrMore suppressing the exception raised by the remaining part of the input and then returning what it has matched so far.

How can I change my code to avoid this inconsistency? OneOrMore is convenient, but is there some other way of doing this without it?

Upvotes: 2

Views: 77

Answers (1)

Konstantin

Reputation: 25359

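So you want to parse the entire string. By default, parseString does not require the grammar to consume all of the input: once OneOrMore can no longer match another atom, parsing stops successfully and the unparsed remainder is ignored.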

To do this, you need to either explicitly add StringEnd() to your grammar

return pp.OneOrMore(atom) + pp.FollowedBy(pp.StringEnd())

or pass parseAll=True to the parseString call

mvce_parser.parseString("(76)(54", parseAll=True)
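For reference, here is a minimal sketch that runs the question's grammar against all of the sample inputs using both approaches. The names build_parser and parses_fully are just illustrative helpers, not part of pyparsing, and the sketch uses a plain StringEnd() at the end of the grammar rather than wrapping it in FollowedBy.

```python
import pyparsing as pp

def build_parser():
    # Same grammar as in the question (build_parser is an illustrative name).
    lpar = pp.Literal('(').suppress()
    rpar = pp.Literal(')').suppress()
    rank = pp.Word('12345678', exact=1)  # a single card rank
    ranks = pp.OneOrMore(rank)
    rank_grouping = pp.Group(lpar + ranks + rpar)
    atom = ranks | rank_grouping
    return pp.OneOrMore(atom)

def parses_fully(parser, text):
    # Hypothetical helper: True only if the grammar consumes the whole string.
    try:
        parser.parseString(text, parseAll=True)
    except pp.ParseException:
        return False
    return True

parser = build_parser()
samples = ["123", "12(34)", "1(234)", "12(34", "(76)(54", "(76(54)"]
results = {text: parses_fully(parser, text) for text in samples}
for text, ok in results.items():
    print(text, "->", "ok" if ok else "syntax error")

# Alternative: bake the end-of-input requirement into the grammar itself,
# so callers do not need to remember parseAll=True.
strict_parser = build_parser() + pp.StringEnd()
try:
    strict_parser.parseString("(76)(54")
    strict_rejected = False
except pp.ParseException:
    strict_rejected = True
print("strict parser rejected (76)(54:", strict_rejected)
```

Which one to pick is mostly a matter of taste: parseAll=True keeps the grammar reusable as a sub-expression of a larger grammar, while appending StringEnd() makes the "whole input" requirement part of the parser object itself.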

Upvotes: 3
