sbaildon

Reputation: 248

Customising Pandoc writer element output

Is it possible to customise element outputs for a pandoc writer?

Given reStructuredText input

.. topic:: Topic Title

   Content in the topic

Using the HTML writer, Pandoc will generate

<div class="topic">
   <p><strong>Topic Title</strong></p>
   <p>Content in the topic</p>
</div>

Is there a supported way to change the HTML output? Say, changing <strong> to <mark>, or adding another class to the parent <div>.

edit: I've assumed the formatting is the responsibility of the writer, but it's also possible it's decided when the AST is created.

Upvotes: 1

Views: 512

Answers (1)

tarleb

Reputation: 22609

This is what pandoc filters are for. Possibly the easiest way is to use Lua filters, as those are built into pandoc and don't require additional software to be installed.

The basic idea is that you'd match on an AST element created from the input, and produce raw output for your target format. So if all Strong elements were to be output as <mark> in HTML, you'd write

function Strong (element)
  -- the result will be the element's contents, which will no longer be 'strong'
  local result = element.content
  -- wrap contents in `<mark>` element
  result:insert(1, pandoc.RawInline('html', '<mark>'))
  result:insert(pandoc.RawInline('html', '</mark>'))
  return result
end

You'd usually want to inspect pandoc's internal representation by running pandoc --to=native YOUR_FILE.rst. This makes it easier to write a filter.
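
The same approach should cover the other half of the question, adding a class to the parent <div>. Here is a minimal sketch, assuming the topic is read as a Div carrying the class `topic` (as the HTML output in the question suggests); the class name `highlighted` is just a placeholder I made up:

```lua
-- Add an extra class to every Div that already has the class 'topic'.
-- 'highlighted' is a hypothetical class name; replace it with your own.
function Div (element)
  if element.classes:includes('topic') then
    element.classes:insert('highlighted')
  end
  -- Divs without the 'topic' class are returned unchanged.
  return element
end
```

Saved as, say, topic-class.lua (the file name is arbitrary), it can be applied with pandoc --lua-filter=topic-class.lua YOUR_FILE.rst.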

There is a similar question on the pandoc-discuss mailing list; it deals with LaTeX output, but is also about handling of custom rst elements. You might find it instructional.


Nota bene: the above can be shortened, because pandoc outputs spans and divs whose class is the name of a known HTML element as that element:

function Strong (element)
  return pandoc.Span(element.content, {class = 'mark'})
end

But I think it's easier to look at the general case first.

Upvotes: 1
