Aaron Meier

Reputation: 950

Don't escape HTML with docutils.core.publish_parts(writer_name="html")

I'm trying to convert some of my HTML content to reStructuredText. The problem is that I have a lot of custom HTML, so much that I'd abandon the conversion if I had to write a special parser for each piece.

By default:

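import docutils.core

# The source is not indented: an indented title would be parsed as part of a block quote.
text = '''
Heading
=======
<p class="jQuery-addThis">Test</p>
'''
docutils.core.publish_parts(text, writer_name='html')['html_body']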

This escapes the <, >, and " characters as &lt;, &gt;, and &quot;.

How do I tell publish_parts (or another function) to NOT convert the HTML?

Additional information:

I need this because I've been told to switch to Markdown, but I'd rather use reStructuredText. Is this possible?

Thanks in advance!

Edit: I should've included "without the raw directive" in the subject line. I'm guessing that this is impossible, but if anyone knows of a way around it, I'd be very grateful.

Upvotes: 3

Views: 625

Answers (1)

Pedro Romano

Reputation: 11203

You need to use the raw data pass-through directive. Your example would become:

import docutils.core

text = '''
Heading
=======

.. raw:: html

   <p class="jQuery-addThis">Test</p>
'''
# The raw directive passes its content through to the HTML writer unescaped.
docutils.core.publish_parts(text, writer_name='html')['html_body']
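If you have many HTML fragments, you probably don't want to write the directive by hand each time. Here is a minimal sketch of a helper that wraps a fragment in a raw directive for you. The names `raw_html_block` and `render` are my own, not part of docutils, and this assumes the default docutils settings, under which the raw directive is enabled.

```python
import textwrap


def raw_html_block(html: str) -> str:
    """Wrap an HTML fragment in a reStructuredText ``raw`` directive.

    Hypothetical helper, not part of docutils.
    """
    # Directive content must be indented relative to the ".. raw::" line
    # and separated from it by a blank line.
    body = textwrap.indent(html.strip("\n"), "   ")
    return f".. raw:: html\n\n{body}\n"


def render(rst_source: str) -> str:
    """Render reStructuredText to an HTML body fragment with docutils."""
    # Imported lazily so raw_html_block() works without docutils installed.
    import docutils.core

    parts = docutils.core.publish_parts(rst_source, writer_name="html")
    return parts["html_body"]


# Build the same document as above, but generate the directive.
source = "Heading\n=======\n\n" + raw_html_block('<p class="jQuery-addThis">Test</p>')
```

Calling `render(source)` should then give you a body in which the `<p class="jQuery-addThis">` element appears as written rather than as escaped entities.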

You can also look into using pandoc to automatically convert the HTML to reStructuredText.
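As a rough sketch of the pandoc route, assuming the `pandoc` executable is installed and on your PATH, you can pipe HTML through it from Python. `html_to_rst` is a hypothetical name of mine, and I haven't checked how well your custom classes survive the conversion, so inspect the output before relying on it.

```python
import subprocess

# Reader and writer names as pandoc spells them: "html" in, "rst" out.
PANDOC_CMD = ["pandoc", "--from=html", "--to=rst"]


def html_to_rst(html: str) -> str:
    """Convert an HTML string to reStructuredText by piping it through pandoc.

    Hypothetical helper; requires the pandoc executable on PATH.
    """
    result = subprocess.run(
        PANDOC_CMD,
        input=html,           # sent to pandoc on stdin
        capture_output=True,  # collect stdout and stderr
        text=True,            # work with str instead of bytes
        check=True,           # raise CalledProcessError if pandoc exits non-zero
    )
    return result.stdout
```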

Upvotes: 2
