Andrew Barr

Reputation: 3800

pandoc markdown to docx - keep list on one page

I have a markdown list like so:

* Question A
  - Answer 1
  - Answer 2
  - Answer 3

I need to ensure that all the answers (1 - 3) appear on the same page as Question A when I convert the markdown document to docx using pandoc. How can I do this?

Upvotes: 3

Views: 3935

Answers (1)

Waylan

Reputation: 42627

Use custom styles in your Markdown and then define those styles in a custom docx template.

It's important to note that Pandoc's documentation states (emphasis added):

Because pandoc’s intermediate representation of a document is less expressive than many of the formats it converts between, one should not expect perfect conversions between every format and every other. Pandoc attempts to preserve the structural elements of a document, but not formatting details...

Of course, Markdown has no concept of "pages" or "page breaks," so that is not something Pandoc can handle by default. However, Pandoc is aware of docx styles. As the documentation explains:

By default, pandoc’s docx output applies a predefined set of styles for blocks such as paragraphs and block quotes, and uses largely default formatting (italics, bold) for inlines. This will work for most purposes, especially alongside a reference.docx file. However, if you need to apply your own styles to blocks, or match a preexisting set of styles, pandoc allows you to define custom styles for blocks and text using divs and spans, respectively.

If you define a div or span with the attribute custom-style, pandoc will apply your specified style to the contained elements. So, for example using the bracketed_spans syntax,

[Get out]{custom-style="Emphatically"}, he said.

would produce a docx file with “Get out” styled with character style Emphatically. Similarly, using the fenced_divs syntax,

Dickinson starts the poem simply:

::: {custom-style="Poetry"}
| A Bird came down the Walk---
| He did not know I saw---
:::

would style the two contained lines with the Poetry paragraph style.

If the styles are not yet in your reference.docx, they will be defined in the output file as inheriting from normal text. If they are already defined, pandoc will not alter the definition.

If you don't want to define the style manually, but would like it applied to every list automatically (or perhaps to every list which follows a specific pattern), you could define a custom filter which applied the style(s) to every matching element in the document.
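As a rough illustration of such a filter, here is a minimal sketch in Python that uses only the standard library and works directly on pandoc's JSON AST. The style names Question, Answer and LastAnswer are hypothetical and must match whatever you define in your reference docx. The sketch only looks at top-level bullet lists whose items contain a nested bullet list, and I have not verified how every pandoc version treats custom-style divs inside list items, so check the result in Word.

```python
#!/usr/bin/env python3
"""Sketch of a pandoc JSON filter that tags question/answer list items."""
import json
import sys

# Hypothetical style names; they must match styles in the reference docx.
QUESTION, ANSWER, LAST_ANSWER = "Question", "Answer", "LastAnswer"


def styled_div(style, blocks):
    """Wrap blocks in a Div carrying a custom-style attribute."""
    return {"t": "Div", "c": [["", [], [["custom-style", style]]], blocks]}


def style_answers(bullet_list):
    """Tag every answer item; the final one gets the LastAnswer style."""
    items = bullet_list["c"]
    for i, item in enumerate(items):
        style = LAST_ANSWER if i == len(items) - 1 else ANSWER
        items[i] = [styled_div(style, item)]
    return bullet_list


def style_question_item(item):
    """An item is a list of blocks: question text plus a nested BulletList."""
    if not any(block["t"] == "BulletList" for block in item):
        return item  # not a question/answer item, leave it alone
    out = []
    for block in item:
        if block["t"] == "BulletList":
            out.append(style_answers(block))
        else:
            out.append(styled_div(QUESTION, [block]))
    return out


def transform(doc):
    """Apply the styles to every top-level bullet list in the document."""
    for block in doc["blocks"]:
        if block["t"] == "BulletList":
            block["c"] = [style_question_item(item) for item in block["c"]]
    return doc


# pandoc calls JSON filters with the output format as the first argument
# and passes the document as JSON on stdin.
if __name__ == "__main__" and len(sys.argv) > 1:
    json.dump(transform(json.load(sys.stdin)), sys.stdout)
```

Saved under a name of your choosing (say keep-together.py, a made-up name) and made executable, it would be passed to pandoc with the --filter option.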

Of course, that only adds the style names to the output. You still need to define the styles (that is, tell Word how to display elements assigned those styles). As the documentation for the --reference-doc option explains:

For best results, the reference docx should be a modified version of a docx file produced using pandoc. The contents of the reference docx are ignored, but its stylesheets and document properties (including margins, page size, header, and footer) are used in the new docx. If no reference docx is specified on the command line, pandoc will look for a file reference.docx in the user data directory (see --data-dir). If this is not found either, sensible defaults will be used.

To produce a custom reference.docx, first get a copy of the default reference.docx: pandoc --print-default-data-file reference.docx > custom-reference.docx. Then open custom-reference.docx in Word, modify the styles as you wish, and save the file.

While modifying custom-reference.docx in Word, you can also add the new custom styles you have used in your Markdown. As @CindyMeister points out in a comment:

Word would handle this using styles, where the Question style would have the paragraph setting "Keep with Next". the Answer style would have this as well. A third style, for the last entry, would NOT have the setting activated. In addition, all three styles would have the paragraph setting "Keep together" activated.
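If you are curious what those settings look like inside the docx, Word stores "Keep with next" and "Keep lines together" in styles.xml as w:keepNext and w:keepLines elements within a style's paragraph properties. The sketch below builds such style elements with Python's standard library purely for illustration. The three style names and the choice to base them on Normal are assumptions, and defining the styles through Word's own dialogs remains the approach described here.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace used throughout styles.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)


def w(tag):
    """Return the namespace-qualified name for a w: tag or attribute."""
    return f"{{{W}}}{tag}"


def paragraph_style(name, keep_with_next):
    """Build a paragraph style element with the keep settings applied."""
    style = ET.Element(
        w("style"),
        {w("type"): "paragraph", w("customStyle"): "1", w("styleId"): name},
    )
    ET.SubElement(style, w("name"), {w("val"): name})
    ET.SubElement(style, w("basedOn"), {w("val"): "Normal"})  # assumed base
    ppr = ET.SubElement(style, w("pPr"))
    if keep_with_next:
        ET.SubElement(ppr, w("keepNext"))  # "Keep with next"
    ET.SubElement(ppr, w("keepLines"))     # "Keep lines together"
    return style


# Hypothetical style names; only the last entry omits "Keep with next".
for name, keep in [("Question", True), ("Answer", True), ("LastAnswer", False)]:
    print(ET.tostring(paragraph_style(name, keep), encoding="unicode"))
```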

Finally, when using pandoc to convert your Markdown to a Word docx file, use the option --reference-doc=custom-reference.docx and your custom style definitions will be included in the generated docx file. As long as you also properly identify which elements in the Markdown document get which styles, you should have a list which doesn't get broken across a page break, as long as the entire list fits on one page.
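If you script the conversion, the final step might look like the following minimal sketch. The file names are placeholders, the optional filter argument is an assumption about your setup, and pandoc must be installed and on your PATH for convert() to work.

```python
import subprocess


def pandoc_command(source, output,
                   reference_doc="custom-reference.docx", json_filter=None):
    """Build the pandoc argument list for a Markdown-to-docx conversion."""
    cmd = ["pandoc", source, "--reference-doc=" + reference_doc, "-o", output]
    if json_filter:
        # Optional JSON filter that applies the custom styles automatically.
        cmd.append("--filter=" + json_filter)
    return cmd


def convert(source, output, **options):
    """Run pandoc and raise CalledProcessError if the conversion fails."""
    subprocess.run(pandoc_command(source, output, **options), check=True)
```

For example, convert("questions.md", "questions.docx") would run pandoc against custom-reference.docx in the current directory.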

Upvotes: 8
