Reputation: 4391
I have a docx file containing a few equations on different pages. Using Python and lxml, I was able to extract the content. I now need to convert the Word equations to LaTeX. Some of the equations appear as:
- eq \f (sinx,\r(1 - sin 2 x))
Is there any Python library or other tool that I can use to convert these equations to LaTeX format?
Here is a snippet of the XML file which I obtained from docxfile/word/document.xml:
<w:p w:rsidR="00677018" w:rsidRPr="007D05E5" w:rsidRDefault="00677018" w:rsidP="00677018">
<w:pPr>
<w:pStyle w:val="w" />
<w:jc w:val="both" /></w:pPr>
<w:r w:rsidRPr="007D05E5">
<w:tab/>
<w:t>a.</w:t>
</w:r>
<w:r w:rsidRPr="007D05E5">
<w:tab/></w:r>
<w:r w:rsidR="00453EF1" w:rsidRPr="007D05E5">
<w:fldChar w:fldCharType="begin" /></w:r>
<w:r w:rsidRPr="007D05E5">
<w:instrText xml:space="preserve">eq \b\bc\[(\a\co2\hs4(7,-3,-1,2))</w:instrText>
</w:r>
<w:r w:rsidR="00453EF1" w:rsidRPr="007D05E5">
<w:fldChar w:fldCharType="end" /></w:r>
<w:r w:rsidRPr="007D05E5">
<w:tab/>
<w:t>b.</w:t>
</w:r>
<w:r w:rsidRPr="007D05E5">
<w:tab/></w:r>
<w:r w:rsidR="00453EF1" w:rsidRPr="007D05E5">
<w:fldChar w:fldCharType="begin" /></w:r>
<w:r w:rsidRPr="007D05E5">
<w:instrText xml:space="preserve">eq \f(5,8)</w:instrText>
</w:r>
<w:r w:rsidR="00453EF1" w:rsidRPr="007D05E5">
<w:fldChar w:fldCharType="end" /></w:r>
<w:r w:rsidR="00453EF1" w:rsidRPr="007D05E5">
<w:fldChar w:fldCharType="begin" /></w:r>
<w:r w:rsidRPr="007D05E5">
<w:instrText xml:space="preserve">eq \b\bc\[(\a\co2\hs4(7,-3,-1,2))</w:instrText>
</w:r>
<w:r w:rsidR="00453EF1" w:rsidRPr="007D05E5">
<w:fldChar w:fldCharType="end" /></w:r>
</w:p>
Upvotes: 2
Views: 3315
Reputation: 28943
I'm not sure this constitutes an answer per se, but perhaps it's on the way to one.
I went looking for such a tool a while back and didn't find one, so I think the short answer is no.
Word supports more than one format for equations. The type you have is known as a "Word EQ Field equation"; the field codes are documented at http://office.microsoft.com/en-us/word-help/field-codes-eq-equation-field-HP005186148.aspx
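To make the "EQ field" idea concrete, here is a rough sketch of how the field instructions could be pulled out of document.xml. It assumes the standard WordprocessingML namespace and uses the stdlib ElementTree rather than lxml just to stay dependency-free; the function name `eq_field_codes` is my own invention, and nested fields are not handled.

```python
import xml.etree.ElementTree as ET

# Standard WordprocessingML namespace (the "w:" prefix in document.xml).
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def eq_field_codes(xml_text):
    """Return the instruction text of every EQ field, in document order.

    A field starts at <w:fldChar w:fldCharType="begin"/>; its instruction may
    be split over several <w:instrText> runs, so the pieces are joined.
    Nested fields are NOT handled in this sketch.
    """
    root = ET.fromstring(xml_text)
    codes, parts, in_field = [], [], False
    for el in root.iter():
        if el.tag == f"{{{W}}}fldChar":
            kind = el.get(f"{{{W}}}fldCharType")
            if kind == "begin":
                in_field, parts = True, []
            elif kind in ("separate", "end") and in_field:
                code = "".join(parts).strip()
                if code.lower().startswith("eq"):
                    codes.append(code)
                in_field = False
        elif el.tag == f"{{{W}}}instrText" and in_field:
            parts.append(el.text or "")
    return codes
```

Run against the paragraph in the question (with the `xmlns:w` declaration present, as it is in a full document.xml), this should hand back strings such as `eq \f(5,8)`, which is the raw material any converter would have to parse.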
A search didn't turn up any Python solutions for this, and I know for certain that python-docx doesn't support it. Wish I had better news for you :(
If you're determined, there appear to be some non-Python solutions out there that do this conversion. They might serve as an alternative, or as an example to study if you decide to whip one up yourself :)
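If you do go the roll-your-own route, a starting point might look something like the sketch below. It is not an existing library: the names (`eq_to_latex` and the helpers) are hypothetical, and it only handles two switches, `\f(a,b)` (fraction) and `\r(x)` / `\r(n,x)` (radical). It also assumes the comma is the list separator and ignores escaped characters; everything else, such as the `\b` and `\a` switches in your matrix example, is passed through untouched.

```python
import re

# Matches "\f(" or "\r(" (case-insensitive), allowing spaces before "(".
_SWITCH = re.compile(r"\\([fr])\s*\(", re.IGNORECASE)


def _match_paren(s, start):
    """Index of the ')' matching the '(' at s[start], or -1 if unbalanced."""
    depth = 0
    for k in range(start, len(s)):
        if s[k] == "(":
            depth += 1
        elif s[k] == ")":
            depth -= 1
            if depth == 0:
                return k
    return -1


def _split_args(s):
    """Split on top-level commas, leaving nested parentheses intact."""
    args, depth, cur = [], 0, []
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            args.append("".join(cur))
            cur = []
        else:
            cur.append(ch)
    args.append("".join(cur))
    return args


def _render(switch, args):
    """LaTeX for one switch, or None if the argument count is unexpected."""
    if switch == "f" and len(args) == 2:
        return "\\frac{" + args[0] + "}{" + args[1] + "}"
    if switch == "r" and len(args) == 1:
        return "\\sqrt{" + args[0] + "}"
    if switch == "r" and len(args) == 2:  # \r(n,x) is the n-th root of x
        return "\\sqrt[" + args[0] + "]{" + args[1] + "}"
    return None


def _convert(s):
    out, i = [], 0
    while i < len(s):
        m = _SWITCH.match(s, i)
        if m:
            close = _match_paren(s, m.end() - 1)
            if close != -1:
                inner = s[m.end():close]
                args = [_convert(a.strip()) for a in _split_args(inner)]
                latex = _render(m.group(1).lower(), args)
                if latex is not None:
                    out.append(latex)
                    i = close + 1
                    continue
        out.append(s[i])  # unsupported or malformed: copy verbatim
        i += 1
    return "".join(out)


def eq_to_latex(code):
    """Convert an EQ field instruction such as 'eq \\f(5,8)' to LaTeX."""
    code = code.strip()
    if code[:2].lower() == "eq":
        code = code[2:].strip()
    return _convert(code)


print(eq_to_latex(r"eq \f(5,8)"))
print(eq_to_latex(r"eq \f (sinx,\r(1 - sin 2 x))"))
```

The recursion in `_convert` is what lets the radical nest inside the fraction in your second example. Supporting the remaining switches (arrays, brackets, integrals and so on) would mean extending `_SWITCH` and `_render` case by case against the field-code documentation.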
Upvotes: 1