Daniel F

Reputation: 340

Convert a list of tokens into XML output

I have a list of tokens generated by pyparsing. I need to manipulate individual tokens in the list based on the tokens around them. Currently, I am just using a for loop. Is there a better mechanism for doing this?

For instance, a simple example is converting [1, "+", 2] into

<block s="reportSum">
    <l>1</l>
    <l>2</l>
</block>

Edit: I have been reading the pyparsing docs, and I know about operatorPrecedence and setParseAction. Ultimately, I am trying to transform one language into another.

For instance, I want to turn say("hi") into <block s="bubble"><l>hi</l></block>. I am currently parsing say("hi") into ["say", "hi"], and would like to know how to transform that into the XML above.

Upvotes: 0

Views: 93

Answers (1)

PaulMcG

Reputation: 63739

In infixNotation (aka operatorPrecedence), you can attach parse actions to each subexpression found. See below:

from pyparsing import *

opfunc = {
    '+': 'reportSum',
    '-': 'reportDifference',
    '*': 'reportProduct',
    '/': 'reportDivision',
    }
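def makeXML(a, op, b):
    # wrap two operands in the block named for this operator
    return '<block s="%s"><l>%s</l><l>%s</l></block>' % (opfunc[op], a, b)

def outputBinary(tokens):
    # tokens[0] is the flat list for one precedence level, e.g. ['1', '+', '2', '+', '3']
    t = tokens[0].asList()
    # build the first block from the leading operand, operator, operand
    ret = makeXML(t.pop(0), t.pop(0), t.pop(0))
    # fold each remaining (operator, operand) pair onto the result, left to right
    while t:
        ret = makeXML(ret, t.pop(0), t.pop(0))
    return ret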



integer = Word(nums)
# expand this to include other supported operands, like floats, variables, etc.
operand = integer

arithExpr = infixNotation(operand, 
    [
    (oneOf('* /'), 2, opAssoc.LEFT, outputBinary),
    (oneOf('+ -'), 2, opAssoc.LEFT, outputBinary),
    ])

tests = """\
    1+2
    1+2*5
    1+2*6/3
    1/4+3*4/2""".splitlines()

for t in tests:
    t = t.strip()
    print(t)
    print(arithExpr.parseString(t)[0])
    print()

giving:

1+2
<block s="reportSum"><l>1</l><l>2</l></block>

1+2*5
<block s="reportSum"><l>1</l><l><block s="reportProduct"><l>2</l><l>5</l></block></l></block>

1+2*6/3
<block s="reportSum"><l>1</l><l><block s="reportDivision"><l><block s="reportProduct"><l>2</l><l>6</l></block></l><l>3</l></block></l></block>

1/4+3*4/2
<block s="reportSum"><l><block s="reportDivision"><l>1</l><l>4</l></block></l><l><block s="reportDivision"><l><block s="reportProduct"><l>3</l><l>4</l></block></l><l>2</l></block></l></block>

Note that parsing '1+2+3' will not give the traditional [['1','+','2'],'+','3'] nested list, but the run-on sequence ['1','+','2','+','3'], which is why outputBinary has to iterate over the list beyond just the first 3 elements.
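
To see that left-to-right fold in isolation, here is a rough sketch that works on a plain Python list, with no pyparsing involved. The names make_xml and fold_left_xml are made up for this illustration, the operator table is copied from opfunc above, and it assumes the flat list always has the shape operand, operator, operand, and so on:

```python
# Operator-to-block-name table, copied from opfunc in the answer above.
opfunc = {
    '+': 'reportSum',
    '-': 'reportDifference',
    '*': 'reportProduct',
    '/': 'reportDivision',
}

def make_xml(a, op, b):
    # Wrap two operands in the block named for this operator.
    return '<block s="%s"><l>%s</l><l>%s</l></block>' % (opfunc[op], a, b)

def fold_left_xml(flat):
    """Fold a flat [operand, op, operand, op, ...] list left to right.

    Hypothetical helper for illustration; assumes an odd-length list.
    """
    result = flat[0]
    # Walk the remaining items two at a time: (operator, right operand).
    for op, rhs in zip(flat[1::2], flat[2::2]):
        result = make_xml(result, op, rhs)
    return result

xml = fold_left_xml(['1', '+', '2', '+', '3'])
print(xml)
```

The first pass wraps '1' and '2' in a reportSum block, and the second pass nests that block as the left operand of another reportSum with '3', which mirrors what the while loop in outputBinary does.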

As for your say("hi") example, something like the following should help:

LPAR,RPAR = map(Suppress,"()")
say_command = Keyword("say")('cmd') + LPAR + delimitedList(QuotedString('"'))('args') + RPAR
ask_command = Keyword("ask")('cmd') + LPAR + delimitedList(QuotedString('"'))('args') + RPAR
cmd_func = {
    'say': 'bubble',
    'ask': 'prompt',
    }
def emitAsXML(tokens):
    func = cmd_func[tokens.cmd]
    args = ''.join('<l>%s</l>' % arg for arg in tokens.args)
    return """<block s="%s">%s</block>""" % (func, args)
cmd = (say_command | ask_command).setParseAction(emitAsXML)

tests = """\
    say("hi")
    say("hi","handsome")
    ask("what is your name?")""".splitlines()

for t in tests:
    t = t.strip()
    print(t)
    print(cmd.parseString(t)[0])
    print()

giving:

say("hi")
<block s="bubble"><l>hi</l></block>

say("hi","handsome")
<block s="bubble"><l>hi</l><l>handsome</l></block>

ask("what is your name?")
<block s="prompt"><l>what is your name?</l></block>
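
One caveat with building the XML by string formatting: an argument containing < or & is inserted unescaped, which produces malformed XML. If that matters for your input, a sketch along these lines uses the standard library's xml.etree.ElementTree to build the same block with escaping. The helper name emit_block is hypothetical, it assumes Python 3, and it takes a plain command name and list of strings rather than pyparsing tokens:

```python
import xml.etree.ElementTree as ET

# Command-to-block-name table, copied from cmd_func in the answer above.
cmd_func = {
    'say': 'bubble',
    'ask': 'prompt',
}

def emit_block(cmd, args):
    """Build <block s="..."><l>arg</l>...</block> with text escaping.

    Hypothetical helper for illustration, not part of pyparsing.
    """
    block = ET.Element('block', {'s': cmd_func[cmd]})
    for arg in args:
        # Each argument becomes the text of its own <l> child element.
        ET.SubElement(block, 'l').text = arg
    # encoding='unicode' returns a str with no XML declaration.
    return ET.tostring(block, encoding='unicode')

plain = emit_block('say', ['hi', 'handsome'])
escaped = emit_block('say', ['1 < 2 & 3'])
print(plain)
print(escaped)
```

Inside a parse action you could call something like emit_block(tokens.cmd, tokens.args) instead of formatting the string by hand.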

If you need a wider context to create some output, then just attach the parse action to the higher-level expression in your parser.

Upvotes: 2
