Reputation: 538
Is it possible to use HTML tags in R Markdown documents rendered to Word?
For example:
---
output: word_document
---
# This is rendered as heading
<h1> But this is not </h1>
This works perfectly when rendering as html_document, but not when rendering as word_document.
A more specific question about tags has been asked here, but without a solution: Underline in RMarkdown to Microsoft Word
Upvotes: 2
Views: 1428
Reputation: 22659
Sure, here we go:
---
output:
  word_document:
    md_extensions: +raw_html-markdown_in_html_blocks
    pandoc_args: ['--lua-filter', 'read_html.lua']
---
# This is rendered as heading
<h1> And this is one, too </h1>
where read_html.lua must be a file in the same directory with this content:
function RawBlock (raw)
  if raw.format:match 'html' and not FORMAT:match 'html' then
    return pandoc.read(raw.text, raw.format).blocks
  end
end
Let's unpack the above to see how it works. The first thing you'll notice is the additional parameters to word_document. The md_extensions modify the way that pandoc parses the text; see the pandoc manual for a full list (or run pandoc --list-extensions=markdown in your terminal). We enable raw_html to make sure that pandoc does not discard raw HTML tags, and disable markdown_in_html_blocks to ensure that we get the whole HTML tag as one block in pandoc's internal format.
The next setting is pandoc_args, where we tell pandoc to use a Lua filter to modify the document during conversion. The filter picks out all HTML blocks, parses them as HTML instead of Markdown, and replaces the raw HTML with the parsing result.
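If you want to check the filter outside of R Markdown, something like the following shell sketch should be roughly equivalent to what the YAML header asks for. The file names example.md and example.docx are made up for illustration, and rmarkdown adds further pandoc options of its own, so treat this as an approximation rather than the exact command it runs.

```shell
#!/bin/sh
# Sketch: approximate the R Markdown setup with a direct pandoc call.
# example.md and example.docx are illustrative names, not from the question.
set -eu

# The Lua filter from the answer, written next to the input file.
cat > read_html.lua <<'EOF'
function RawBlock (raw)
  if raw.format:match 'html' and not FORMAT:match 'html' then
    return pandoc.read(raw.text, raw.format).blocks
  end
end
EOF

# A minimal input mixing a Markdown heading and a raw HTML heading.
cat > example.md <<'EOF'
# This is rendered as heading

<h1> And this is one, too </h1>
EOF

# Same extensions as md_extensions, same filter as pandoc_args.
if command -v pandoc >/dev/null 2>&1; then
  pandoc --from markdown+raw_html-markdown_in_html_blocks \
         --lua-filter read_html.lua \
         --output example.docx example.md \
    || echo "pandoc conversion failed" >&2
else
  echo "pandoc not found; skipping conversion" >&2
fi
```

If pandoc is installed, opening example.docx should let you confirm whether both lines come out as headings.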
So if you are using raw HTML that pandoc can read, you'll be fine. If you are using special instructions which pandoc cannot read, then the setup described above won't help either. You'd have to rewrite the markup in OOXML, the XML format used in docx.
Upvotes: 3