Markdown syntax with pyparsing, getting spaces correct

Question

I'm writing a small program that converts a reduced Markdown syntax to HTML (as a learning exercise), but I'm having trouble getting the spacing correct:

from pyparsing import *

strong  = QuotedString("**")
text    = Word(printables)
tokens  = strong | text
grammar = OneOrMore(tokens)

strong.setParseAction(lambda x:"%s"%x[0])

A = "The **cat** in the **hat**."
print(' '.join(grammar.parseString(A)))

What I get:

The cat in the hat .

What I would like:

The cat in the hat.

Yes, this can be done without pyparsing, and other utilities exist that do exactly the same thing (e.g. pandoc), but I would like to know how to do this using pyparsing.

Birei · Accepted Answer

I'm not very skilled with pyparsing, but I would try using transformString() instead of parseString(), and leaveWhitespace() on the matched tokens, like this:

from pyparsing import *

strong  = QuotedString("**").leaveWhitespace()
text    = Word(printables).leaveWhitespace()
tokens  = strong | text
grammar = OneOrMore(tokens)

strong.setParseAction(lambda x:"%s"%x[0])

A = "The **cat** in the **hat**."
print(grammar.transformString(A))

It yields:

The cat in the hat.

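UPDATE: An improved version, pointed out by Paul McGuire (see comments). Because transformString() copies any unmatched text through unchanged, only the strong expression needs to be applied: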

from pyparsing import *

strong  = QuotedString("**")

strong.setParseAction(lambda x:"%s"%x[0])

A = "The **cat** in the **hat**."
print(strong.transformString(A))
