Reputation: 53
This is my actual code:
import bleach
import markdown

html = """
**### The Facebook Campaign will be aligned:**
**### What you get:**
"""

def render_markdown(text):
    if not text:
        return ''
    html = markdown.markdown(text, extensions=[
        'markdown.extensions.sane_lists',
        'markdown.extensions.nl2br',
    ])
    return bleach.clean(html, tags=[
        'p', 'h1', 'h2', 'br', 'h3', 'b', 'strong', 'u', 'i', 'em', 'hr', 'ul', 'ol', 'li', 'blockquote'
    ])

print(render_markdown(html))
The problem is that some users stack several Markdown markers, for example ### What you get: wrapped inside bold markers, and the conversion result is this:
<p><strong>### The Facebook Campaign will be aligned to your business goal (1 x goal per hour):</strong></p>
<p><strong>### What you get:</strong></p>
How can I prevent this? I want to return clean HTML with no leftover Markdown markup in the text. The ideal output is:
<p><strong>The Facebook Campaign will be aligned to your business goal (1 x goal per hour):</strong></p>
<p><strong>What you get:</strong></p>
Upvotes: 1
Views: 2374
Reputation: 10403
**### foo**
is actually parsed correctly and really means:
<p><strong>### foo</strong></p>
This isn't a Python or Markdown issue, just a user who doesn't know how to format Markdown.
If you want to clean this up, you will have to preprocess the user's input. That is not really a Markdown question but a much more generic one, and it will almost certainly require a regex.
For <h3>Foo</h3>, write ### Foo
For <p><strong>Foo</strong></p>, write **Foo**
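Before rewriting anything, it can help to find the lines where heading marks ended up inside bold markers. The following is only a sketch using the standard library: the helper name find_mixed_markers is made up for illustration, and the pattern only covers the `**### ...**` shape from the question.

```python
import re

# Hypothetical pattern: bold markers whose content starts with '#' marks.
MIXED = re.compile(r'\*\*\s*#+[^*]+\*\*')

def find_mixed_markers(text):
    """Return (line_number, line) pairs where '#' marks sit inside bold markers."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if MIXED.search(line)
    ]

sample = "**### What you get:**\n### A real heading\n**Plain bold**"
print(find_mixed_markers(sample))
```

Real headings and ordinary bold text are not reported, so the result could be used to warn the user or to decide which lines to rewrite.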
OK, so if you want to fix this specific case, here is how:
import re

string = '**### foo**'
# Drop the '#' marks (and any spaces around the text) inside the bold markers.
print(re.sub(r'\*\*\s*#+\s*([^*]+?)\s*\*\*', r'**\1**', string))
Output
**foo**
So, final function:
def render_markdown(text):
    if not text:
        return ''
    # Strip '#' marks that sit directly inside **bold** markers before converting.
    text = re.sub(r'\*\*\s*#+\s*([^*]+?)\s*\*\*', r'**\1**', text)
    html = markdown.markdown(text, extensions=[
        'markdown.extensions.sane_lists',
        'markdown.extensions.nl2br',
    ])
    return bleach.clean(html, tags=[
        'p', 'h1', 'h2', 'br', 'h3', 'b', 'strong', 'u', 'i', 'em', 'hr', 'ul', 'ol', 'li', 'blockquote'
    ])
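To check the substitution on its own, without markdown or bleach installed, it can be run over lines shaped like those in the question. This is a sketch under assumptions: strip_heading_marks and HEADING_IN_BOLD are hypothetical names, and the pattern here trims the whitespace around the captured text so the bold markers sit right next to the words.

```python
import re

# Bold markers, optional spaces, one or more '#', then the text to keep.
HEADING_IN_BOLD = re.compile(r'\*\*\s*#+\s*([^*]+?)\s*\*\*')

def strip_heading_marks(text):
    """Drop '#' marks that sit directly inside **bold** markers."""
    return HEADING_IN_BOLD.sub(r'**\1**', text)

source = (
    "**### The Facebook Campaign will be aligned:**\n"
    "**### What you get:**\n"
    "### A real heading stays as it is\n"
)
print(strip_heading_marks(source))
```

The two bold lines lose their '#' marks while the real heading is left alone. As with any regex cleanup this is a heuristic: unusual input, such as two separate bold spans with '#' between them on one line, may need extra handling.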
Upvotes: 2