Reputation: 15139
Is it possible to somehow tell pandoc
to carry the names of styles from original HTML to .docx?
I understand that in order to tune the actual styles, I should be using reference.docx
file generated by pandoc
. However, reference.docx
is limited to what styles it has to: headings, body text, block text, etc.
I'd like to:
specify "myStyle" style in the input HTML (via a "class" attribute, via any other HTML attribute or even via a filter code written in Lua),
<html>
<body>
<p>Hello</p>
<p class="myStyle">World!</p>
</body>
</html>
add a custom "myStyle" to reference.docx
using Word,
run a html->docx
conversion an expect pandoc
generate a paragraph element with "myStyle" (instead of BodyText
, which I believe it sets by default), so the end result looks like this (contents of word/document.xml
inside the resulting output.docx
was cut for brevity):
<w:p>
<w:pPr>
<w:pStyle w:val="BodyText" />
</w:pPr>
<w:r>
<w:txml:space="preserve">Hello</w:t>
</w:r>
</w:p>
<w:p>
<w:pPr>
<w:pStyle w:val="myStyle" />
</w:pPr>
<w:r>
<w:txml:space="preserve">World!</w:t>
</w:r>
</w:p>
There's some evidence styleId
can be passed around, but I don't really understand it and am unable to find any documentation about it.
Doc on filtering in Lua states you can access attrs
when manipulating a pandoc.div
, but it says nothing about whether any of the attrs will be interpreted by pandoc in any meaningful way.
Upvotes: 4
Views: 820
Reputation: 15139
Finally, found what I needed – Custom styles. It's limited, but better than what I arrived earlier, and of course much better than nothing at all :)
I'll leave a step-by-step guide here in case anyone stumbles upon a similar question.
First, generate a reference.docx
file like this:
pandoc --print-default-data-file reference.docx > styles.docx
Then open the file in MS Word (I was using a macOS version) you'll see this:
Click the "New style..." button on the right, and create a style to your liking. In my case I made change the style of text to be bold, in blue color:
Since I am converting from HTML to DOCX, here's my input.html
:
<html>
<body>
<div>Page 1</div>
<div custom-style="eugene-is-testing">Page 2</div>
<div>Page 3</div>
</body>
</html>
Run:
pandoc --standalone --reference-doc styles.docx --output output.docx input.html
Finally, enjoy the result:
Upvotes: 5