rangarajan

Reputation: 191

Generating word documents with equations and formulae using python

Is there any way to generate Word documents with equations and formulae using Python? I'm currently using python-docx to generate Word documents. I checked the python-docx documentation but didn't find anything about formulae or equations.

I need both to extract formulae from a Word document and to generate a new Word document containing the extracted formulae.

Is it possible in Python to extract equations from a Word document, store them in a database or similar, and then generate a Word document with those formulae and equations?

Chemical equations

EDIT: I have attached some of the equations I need to extract/generate

Upvotes: 0

Views: 3646

Answers (2)

denpost

Reputation: 399

The code below generates an MS Word document containing an equation built from a MathML string.

NOTE: You will need the MML2OMML.XSL file, which ships with MS Office, e.g. at C:\Program Files\Microsoft Office\root\Office16\MML2OMML.XSL.

from docx import Document
from lxml import etree

# MathML representation of "(x+y)²"
mathml_string = """
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <msup>
    <mrow>
      <mfenced>
        <mrow>
          <mi>x</mi>
          <mo>+</mo>
          <mi>y</mi>
        </mrow>
      </mfenced>
    </mrow>
    <mn>2</mn>
  </msup>
</math>
"""

# parse XML from MathML string content
mathml_tree = etree.fromstring(mathml_string)

# convert MathML to Office MathML (OMML) using XSLT
# NOTE: You can find "MML2OMML.XSL" in the MS Office distribution, e.g.: "C:\Program Files\Microsoft Office\root\Office16\MML2OMML.XSL"
xslt = etree.parse('MML2OMML.XSL')
transform = etree.XSLT(xslt)
omml_tree = transform(mathml_tree)

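# Optional: serialize the Office MathML (OMML) to a string, e.g. for inspection or storage.
# The string is not needed for the insert below, which appends the element tree directly.
omml_string = etree.tostring(omml_tree, pretty_print=True, encoding="unicode")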

# Write to Word document
document = Document()
p = document.add_paragraph()

# Append the converted OMML to the paragraph
p._element.append(omml_tree.getroot())  # Append the root element of the OMML tree

# Save the document
document.save("simpleEq_with_Formula.docx")

Inspired by other answers on Stack Overflow.
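
The question also asks about extraction, which the code above does not cover. Here is a minimal sketch of that direction using only the standard library. It assumes the equations were created with Word's equation editor, so that they are stored as m:oMath elements in word/document.xml. The helper name extract_omml is made up for illustration, and equations in headers, footers or footnotes are not handled.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by Office MathML (OMML) elements inside a .docx file
M_NS = "http://schemas.openxmlformats.org/officeDocument/2006/math"
# Keep the conventional "m:" prefix when serializing
ET.register_namespace("m", M_NS)


def extract_omml(docx_path):
    """Return each equation in the main document body as an OMML XML string.

    Hypothetical helper: assumes equations are stored as <m:oMath> elements.
    """
    # A .docx file is a zip archive; the body lives in word/document.xml
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    equations = []
    for eq in root.iter(f"{{{M_NS}}}oMath"):
        eq.tail = None  # drop any text/whitespace that follows the element
        equations.append(ET.tostring(eq, encoding="unicode"))
    return equations
```

Each returned string is a standalone XML fragment, so it could be stored in a database as text. To put one back into a new document, it should be possible to parse it with etree.fromstring from lxml and append the resulting element to a paragraph in the same way as omml_tree.getroot() above.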

Upvotes: 0

aaossa

Reputation: 3852

It seems that python-docx does not support adding math equations, since "there is currently no API support for equations" (GitHub issue).

Upvotes: 1
