Reputation: 88118
I'm writing a small program that converts a reduced Markdown syntax to HTML (as a learning exercise), but I'm having trouble getting the spacing correct:
from pyparsing import *
strong = QuotedString("**")
text = Word(printables)
tokens = strong | text
grammar = OneOrMore(tokens)
strong.setParseAction(lambda x:"<strong>%s</strong>"%x[0])
A = "The **cat** in the **hat**."
print(' '.join(grammar.parseString(A)))
What I get:
The <strong>cat</strong> in the <strong>hat</strong> .
What I would like:
The <strong>cat</strong> in the <strong>hat</strong>.
Yes, this can be done without pyparsing, and other utilities exist that do exactly the same thing (e.g. pandoc), but I would like to know how to do it using pyparsing.
Upvotes: 3
Views: 509
Reputation: 36262
I'm not very skilled with pyparsing, but I would try using transformString() instead of parseString(), and leaveWhitespace() on the matched tokens, like:
from pyparsing import *
strong = QuotedString("**").leaveWhitespace()
text = Word(printables).leaveWhitespace()
tokens = strong | text
grammar = OneOrMore(tokens)
strong.setParseAction(lambda x:"<strong>%s</strong>"%x[0])
A = "The **cat** in the **hat**."
print(grammar.transformString(A))
It yields:
The <strong>cat</strong> in the <strong>hat</strong>.
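For what it's worth, the stray space in the original attempt seems to come from the ' '.join() step rather than from the grammar itself: parseString() returns a flat list of tokens with the whitespace already discarded, and joining on a space puts one between every pair of tokens. A minimal sketch in plain Python, where the token list is my reconstruction from the "What I get" output in the question (an assumption, not something I captured from pyparsing):

```python
# Token list inferred from the "What I get" output in the question;
# the original whitespace between tokens is no longer available here.
tokens = ["The", "<strong>cat</strong>", "in", "the", "<strong>hat</strong>", "."]

# ' '.join() separates *every* adjacent pair of tokens with a space,
# including the closing tag and the period that were adjacent in the input.
joined = " ".join(tokens)
print(joined)
```

Once the tokens are in a flat list there is no record of where the input had spaces and where it did not, which is why working on the original string with transformString() is the easier route.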
UPDATE: Improved version pointed out by Paul McGuire (see comments):
from pyparsing import *
strong = QuotedString("**")
strong.setParseAction(lambda x:"<strong>%s</strong>"%x[0])
A = "The **cat** in the **hat**."
print(strong.transformString(A))
Upvotes: 3