Sean D

Reputation: 401

How to extract maths from LaTeX documents

I'd like to be able to take a (potentially complex) LaTeX document and pull out the LaTeX source that would be rendered in math mode. The options I can think of are grep, extract, pandoc, and plasTeX.

Unfortunately, grepping is hacky and doesn't work with macros; extract seems to work, but is awkward to use; both pandoc and plasTeX have trouble with complicated "real-world" documents.

Am I overlooking any easier/more robust way to do this?

Upvotes: 1

Views: 594

Answers (1)

mb21

Reputation: 39199

While pandoc cannot represent more complicated layouts, it does support math, and the pandoc LaTeX reader detects math environments very reliably. So I'd suggest writing a pandoc filter that drops everything but the Math elements. You can also write filters in Python, but in Haskell it would be something along the lines of:

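#!/usr/bin/env runhaskell
-- dropNonMath.hs
import Text.Pandoc.JSON

main :: IO ()
main = toJSONFilter dropNonMath
  where
    -- Keep Math inlines and drop every other inline element.
    -- Returning a list lets the filter delete an element by returning [].
    dropNonMath :: Inline -> [Inline]
    dropNonMath (Math x y) = [Math x y]
    dropNonMath _          = []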

then run it with:

pandoc --filter dropNonMath.hs -f latex -t latex input.tex
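
If you'd rather go the Python route, here is a rough sketch that uses only the standard library instead of a filter package. It is not a filter in the strict sense: it assumes you first dump the document as pandoc's JSON AST (`pandoc -f latex -t json input.tex -o input.json`) and then walks that JSON looking for Math nodes. The script name `extract_math.py` and the function name `iter_math` are made up for illustration, and the node layout it relies on (`{"t": "Math", "c": [{"t": "InlineMath"}, "..."]}`) is my understanding of current pandoc JSON output, so check it against your pandoc version.

```python
#!/usr/bin/env python3
# extract_math.py (hypothetical name): list the LaTeX source of every Math
# element in a pandoc JSON AST. Assumes Math nodes look like
# {"t": "Math", "c": [{"t": "InlineMath" or "DisplayMath"}, "<latex source>"]}.
import json
import sys


def iter_math(node):
    """Yield (kind, source) for each Math element under node, in document order."""
    if isinstance(node, dict):
        if node.get("t") == "Math":
            math_type, source = node["c"]
            yield math_type["t"], source
        else:
            # Not a Math node: keep descending into its contents.
            for value in node.values():
                yield from iter_math(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_math(item)
    # Strings, numbers, and None cannot contain Math nodes, so they are skipped.


# Only runs when a JSON file path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8") as handle:
        doc = json.load(handle)
    for kind, source in iter_math(doc):
        print(f"{kind}\t{source}")
```

Run it as `python3 extract_math.py input.json`. Unlike the Haskell filter, this prints the math source directly rather than producing a new document, which may be closer to what you want if the goal is just to collect the formulas.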

Upvotes: 2
