Reputation: 13
I'm having trouble parsing a recursive grammar with pyparsing. Test #5 in the code below fails, even though I expected it to be recognized as three matches of the "param" parser (two of them nested under one "parent"):
import pyparsing as p
DOUBLE_QUOTE = p.Word('"')
SINGLE_QUOTE = p.Word("'")
COMMA = p.Suppress(p.Word(","))
EQUALS = p.Suppress(p.Word("="))
RIGHT_PAREN = p.Suppress(p.Word(")"))
LEFT_PAREN = p.Suppress(p.Word("("))
WORD = p.Word(p.alphanums + '<' + '<' + '>' + '/' + '.' + ':' + \
';' + '-' + '_' + '$' + '+' + '*' + '&' + '!' + '%' + '?' + '@' + '\\')
QUOTED_STRING = p.QuotedString("'") | p.QuotedString('"')
value = WORD | QUOTED_STRING
value_list = value + p.ZeroOrMore(COMMA + value)
keyword = WORD
pv1 = value
pv2 = (LEFT_PAREN + value_list + RIGHT_PAREN)
pv3 = p.Forward()
param = keyword + EQUALS + p.Group(p.OneOrMore(pv3) | pv2 | pv1)
pv3 << (LEFT_PAREN + param + RIGHT_PAREN)
parser = p.OneOrMore(p.Group(param))
tests = []
tests.append("""l1=v1""")
tests.append("""l1=(v1,v2,v3)""")
tests.append("""l1=(v1,v2,v3) l1=(v4, v5, v6)""")
tests.append("""l1=(l2=v1)""")
tests.append("""l1=v1 l1=v2""")
# This test fails
tests.append("""l1=(l2=(l3=v1))""")
results = []
for (i, test_string) in enumerate(tests):
    try:
        results.append(parser.parseString(test_string))
    except Exception as e:
        print("Failed test #{}".format(i))
        print(e)
Where did I go wrong here?
Upvotes: 1
Views: 513
Reputation: 4532
It took me some time to figure this one out, as I first checked whether your recursion was correct. It turned out that your code was fine except for two lines near the top, which I had assumed were correct.
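The error is caused by defining the parentheses with p.Word instead of p.Literal. p.Word(")") matches one or more consecutive ")" characters, so in the nested case the inner closing parenthesis consumes "))" and nothing is left for the outer one. Changing your code to this should make it work: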
RIGHT_PAREN = p.Suppress(p.Literal(")"))
LEFT_PAREN = p.Suppress(p.Literal("("))
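To check the fix, here is a minimal sketch of the grammar from the question with the parentheses defined as p.Literal. WORD is reduced to alphanumerics plus "_" to keep the example short (a simplification, not the original character set), and `<<=` is used in place of `<<` to attach the recursive definition; the rest follows the question's structure.

```python
import pyparsing as p

# Punctuation as exact, single-character matches (the fix from above).
COMMA = p.Suppress(p.Literal(","))
EQUALS = p.Suppress(p.Literal("="))
LEFT_PAREN = p.Suppress(p.Literal("("))
RIGHT_PAREN = p.Suppress(p.Literal(")"))

# Simplified character set; the question allows many more symbols.
WORD = p.Word(p.alphanums + "_")
value = WORD | p.QuotedString("'") | p.QuotedString('"')
value_list = value + p.ZeroOrMore(COMMA + value)

pv1 = value
pv2 = LEFT_PAREN + value_list + RIGHT_PAREN
pv3 = p.Forward()
param = WORD + EQUALS + p.Group(p.OneOrMore(pv3) | pv2 | pv1)
pv3 <<= LEFT_PAREN + param + RIGHT_PAREN

parser = p.OneOrMore(p.Group(param))

# Second positional argument asks pyparsing to require a full-string match.
result = parser.parseString("l1=(l2=(l3=v1))", True)
print(result.asList())  # [['l1', ['l2', ['l3', ['v1']]]]]
```

Requiring a full-string match is optional here, but it makes the check stricter: a partial match would raise instead of silently succeeding.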
Just a reminder from the PyParsing wiki:
Literal - construct with a string to be matched exactly
Word - one or more contiguous characters; construct with a string containing the set of allowed initial characters, and an optional second string of allowed body characters;
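The difference is easy to see in isolation. This small sketch (variable names are mine, for illustration only) feeds the same two-character input "))" to both kinds of expression:

```python
import pyparsing as p

word_paren = p.Word(")")        # one or more ")" characters, matched greedily
literal_paren = p.Literal(")")  # exactly one ")"

# Word swallows both closing parentheses as a single token.
word_tokens = word_paren.parseString("))").asList()
print(word_tokens)     # ['))']

# Literal stops after the first one, leaving the second for an outer rule.
literal_tokens = literal_paren.parseString("))").asList()
print(literal_tokens)  # [')']
```

That greedy match is why the single-level tests pass with p.Word while the doubly nested one fails.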
Upvotes: 3