Pyparsing Token Source Range

Question

How do I programmatically extract the source range (start and end position) of a grammar rule match in Pyparsing? I can't use setParseAction for this (sub-)rule, since I'm inspecting the parse tree contents inside another callback that is itself specified as a parse action. I'm also missing a function that prints the contents returned by parseString() in a human-readable way, similar to pprint. I'm aware of asList(), but I'm not sure whether interesting information, such as context, is stripped away by this method.

PaulMcG · Accepted Answer

Here's some sample code showing how to capture a parsed expression's location, and how to use dump() to list the parsed data and named results:

from pyparsing import *

# use an Empty to define a null token that just records its
# own location in the input string
locnMarker = Empty().leaveWhitespace().setParseAction(lambda s,l,t: l)

# define an example expression and save the start and end locations
markedInteger = locnMarker + Word(nums)('value') + locnMarker

# assign named results for the start and end values,
# and pop them out of the actual list of tokens
def markStartAndEnd(s,l,t):
    t['start'],t['end'] = t.pop(0),t.pop(-1)
markedInteger.setParseAction(markStartAndEnd)

# find all integers in the source string, and print
# their value, start, and end locations; use dump()
# to show the parsed tokens and any named results
source = "ljsdlj2342 sadlsfj132 sldfj12321 sldkjfsldj 1232"
for integer in markedInteger.searchString(source):
    print(integer.dump())

Prints:

['2342']
- end: 11
- start: 6
- value: 2342
['132']
- end: 22
- start: 18
- value: 132
['12321']
- end: 33
- start: 27
- value: 12321
['1232']
- end: 48
- start: 44
- value: 1232
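
If your pyparsing version ships the locatedExpr helper, it may save you from defining the marker by hand. Here is a minimal sketch under that assumption: it wraps the integer in locatedExpr, which groups the match together with the named results locn_start, value, and locn_end, and then reads that span from the parse action of an enclosing rule, which seems to be the situation described in the question. The assignment rule, the record_span callback, and the sample input are hypothetical and only there for illustration. Newer releases also provide a Located class that is meant to replace locatedExpr.

```python
from pyparsing import Suppress, Word, alphas, locatedExpr, nums

# locatedExpr wraps an expression in a group that carries three named
# results: locn_start, value, and locn_end.
located_integer = locatedExpr(Word(nums))("number")

# Hypothetical enclosing rule, used only to show how an outer parse
# action can read the span of a nested sub-rule.
assignment = Word(alphas)("key") + Suppress("=") + located_integer

seen = []

def record_span(s, l, t):
    # The sub-rule needs no parse action of its own here: its span is
    # already stored in the nested group.
    number = t["number"]
    start, end = number["locn_start"], number["locn_end"]
    seen.append((t["key"], start, end, s[start:end]))

assignment.setParseAction(record_span)

source = "width = 120"
result = assignment.parseString(source)

# dump() lists the tokens plus any named results, including nested ones
print(result.dump())
print(seen)
```

In this sketch the end offset is exclusive, so s[start:end] gives back the matched text.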
