Reputation: 63
I'm new to Pyparsing (and pretty new to Python). I have tried to reduce my problem down to the simplest form that will illustrate what's going wrong (to the point where I probably wouldn't need Pyparsing at all!)
Suppose I've got a string consisting of letters and numbers, such as "b7 z4 a2 d e c3". There's always a letter, but the number is optional. I want to parse this into its individual elements and then process them, but where there is a bare letter with no number, it would be handy to change it so that it had the "default" number 1 after it. Then I could process every element in a consistent way. I thought I could do this with setParseAction, as follows:
from pyparsing import *
teststring = "a2 b5 c9 d e z"
expected_letter = Word("ABCDEFGabcdefgzZxy", exact=1)
expected_number = Word(nums)
letter_and_number = expected_letter + expected_number
bare_letter = expected_letter
bare_letter.setParseAction( lambda s,l,t: t.append("1") )
elements = letter_and_number | bare_letter
line = OneOrMore(elements)
print(line.parseString(teststring))
Unfortunately, the t.append() doesn't do what I'm expecting, which was to add a "1" to the list of parsed tokens. Instead, I get an error: TypeError: 'str' object is not callable.
I'm probably just being really thick here, but could one of you experts please set me straight?
Thanks
Steve
Upvotes: 6
Views: 1194
Reputation: 63739
One of the basic concepts to get about pyparsing is that it does not work with just lists of strings, but assembles the parsed pieces into a ParseResults object. ParseResults is a rich data type defined in pyparsing, that can be accessed as a list, or as a dict or object if there are tokens that have been parsed from a ParserElement with a defined results name.
However, while ParseResults was designed with easy access in mind, it is limited in the ways it can be updated. Internally in pyparsing, each expression that matches creates a small ParseResults object; if this is part of a larger expression, that expression accumulates the pieces into a larger ParseResults using the += operator.
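As a small illustration of those access styles, here is a sketch that uses two results names, `letter` and `number`. Both names are made up for this example and do not appear in the question's code.

```python
from pyparsing import Word, alphas, nums

# The results names "letter" and "number" are illustrative assumptions.
pair = Word(alphas, exact=1)("letter") + Word(nums)("number")

first = pair.parseString("b7")
second = pair.parseString("z4")

print(first[0], first[1])   # list-style access
print(first["number"])      # dict-style access
print(first.letter)         # attribute-style access

# Two ParseResults can be combined with +, much as pyparsing
# accumulates the pieces of a larger expression with +=.
combined = (first + second).asList()
print(combined)
```

Calling `asList()` at the end converts the result into an ordinary Python list, which is often easier to work with once parsing is done.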
In your case, you can append to the ParseResults that is passed in by creating a small ParseResults containing "1" and adding it to t:
t += ParseResults("1")
Unfortunately, this won't work as a lambda, because += is a statement and a lambda body must be a single expression. You could try
lambda s,l,t: t.__iadd__(ParseResults("1"))
But this feels a little too clever.
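If the lambda form feels too clever, the same in-place `+=` can go in an ordinary named function. The sketch below is one way to do that, not the only one. The function name `add_default_number` is hypothetical, and the `.copy()` call is my own addition: in the question's code `bare_letter` and `expected_letter` are the same object, so a parse action attached to one would also fire inside `letter_and_number`.

```python
from pyparsing import OneOrMore, ParseResults, Word, nums

def add_default_number(s, loc, toks):
    # Hypothetical helper: extend the matched tokens in place with "1".
    # Returning None tells pyparsing to keep the (now modified) tokens.
    toks += ParseResults(["1"])

expected_letter = Word("ABCDEFGabcdefgzZxy", exact=1)
expected_number = Word(nums)
letter_and_number = expected_letter + expected_number

# copy() gives bare_letter its own parse action, separate from expected_letter.
bare_letter = expected_letter.copy().setParseAction(add_default_number)

elements = letter_and_number | bare_letter
line = OneOrMore(elements)

result = line.parseString("a2 b5 c9 d e z").asList()
print(result)
```

With this arrangement every letter in the output should be followed by a number, either the one that was parsed or the default "1".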
You might also rethink your parser a bit, to take advantage of the Optional class. Think of your trailing digit as an optional element, for which you can define a default value to provide in case the element is missing. I think you can define what you want with just:
>>> letter = Word(alphas,exact=1)
>>> digit = Word(nums,exact=1)
>>> teststring= "a2 b5 c9 d e z"
>>> letter_and_digit = Combine(letter + Optional(digit, default="1"))
>>> print (sum(letter_and_digit.searchString(teststring)))
['a2', 'b5', 'c9', 'd1', 'e1', 'z1']
Combine is used to rejoin the separate letters and digits into strings; otherwise each match would look like ['a','2'], ['b','5'], etc.
(Normally, searchString returns a list of ParseResults objects, which would look like a list of single-element lists. Passing the results of searchString to sum adds them all together into a single ParseResults of strings.)
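If you would rather keep each letter and number as a separate pair for later processing, a variation on the same idea is to wrap the pair in Group instead of Combine. This is a sketch under that assumption, using the sample string from the question and allowing multi-digit numbers with a plain `Word(nums)`.

```python
from pyparsing import Group, OneOrMore, Optional, Word, alphas, nums

letter = Word(alphas, exact=1)
number = Word(nums)

# Group keeps each letter/number pair together as its own sublist;
# Optional supplies "1" whenever the number is missing.
element = Group(letter + Optional(number, default="1"))
line = OneOrMore(element)

pairs = line.parseString("b7 z4 a2 d e c3").asList()
for name, count in pairs:
    print(name, count)
```

Each entry in `pairs` is a two-item list, so every element can be unpacked and handled the same way whether or not the input carried a number.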
Upvotes: 5