Ben

Reputation: 249

Convert html mathjax to markdown with pandoc

I have some HTML files that contain MathJax commands. I would like to convert them to PHP Markdown Extra using pandoc.

The problem is that pandoc adds a "\" before all math commands and special characters. For example: \begin{equation} \$ x\^2 etc.

Do you know how to avoid that with pandoc? I think a related question is this one: How to convert HTML with mathjax into latex using pandoc?

Upvotes: 0

Views: 1793

Answers (1)

John MacFarlane

Reputation: 8937

You can write a short Haskell program unescape.hs:

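{-# LANGUAGE OverloadedStrings #-}
-- Disable backslash escaping of special characters when writing strings to markdown.
-- Uses Text.Pandoc.JSON from the pandoc-types package; toJSONFilter replaces
-- the older toJsonFilter.
import Text.Pandoc.JSON

main :: IO ()
main = toJSONFilter unescape
  where
    -- Emit every plain string as raw markdown so the writer does not escape it.
    unescape :: Inline -> Inline
    unescape (Str xs) = RawInline (Format "markdown") xs
    unescape x        = x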

Compile it with ghc --make unescape.hs, then use it like this:

pandoc -f html -t json | ./unescape | pandoc -f json -t markdown

This will disable escaping of special characters (like $) in markdown output.

A simpler approach might be to pipe pandoc's normal markdown output through sed:

pandoc -f html -t markdown | sed -e 's/\\\([$^_*]\)/\1/g'
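To see what the sed expression does without involving pandoc, here is a small sketch that runs it on a hand-written sample line. The function name unescape_math and the sample text are made up for illustration; only the sed expression itself comes from the command above.

```shell
# Hypothetical helper wrapping the sed expression from the answer.
# It removes a backslash that directly precedes one of: $ ^ _ *
unescape_math() {
  sed -e 's/\\\([$^_*]\)/\1/g'
}

# Hand-written sample imitating escaped markdown output (an assumption,
# not captured from a real pandoc run).
printf '%s\n' 'Let \$x\^2 + y\_1\$ be given.' | unescape_math
```

Note that the substitution applies to the whole document, not only to math: an intentionally escaped \* or \_ in ordinary prose loses its backslash too and may then be read as emphasis markup.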

Upvotes: 2
