Yuriy Yosipiv
Yuriy Yosipiv

Reputation: 63

How to update table of contents in docx-file with python on linux?

I've got a problem with updating table of contents in docx-file, generated by python-docx on Linux. Generally, it is not difficult to create TOC (Thanks for this answer https://stackoverflow.com/a/48622274/9472173 and this thread https://github.com/python-openxml/python-docx/issues/36)

from docx.oxml.ns import qn
from docx.oxml import OxmlElement

paragraph = self.document.add_paragraph()
run = paragraph.add_run()
fldChar = OxmlElement('w:fldChar')  # creates a new element
fldChar.set(qn('w:fldCharType'), 'begin')  # sets attribute on element
instrText = OxmlElement('w:instrText')
instrText.set(qn('xml:space'), 'preserve')  # sets attribute on element
instrText.text = 'TOC \o "1-3" \h \z \u'   # change 1-3 depending on heading levels you need

fldChar2 = OxmlElement('w:fldChar')
fldChar2.set(qn('w:fldCharType'), 'separate')
fldChar3 = OxmlElement('w:t')
fldChar3.text = "Right-click to update field."
fldChar2.append(fldChar3)

fldChar4 = OxmlElement('w:fldChar')
fldChar4.set(qn('w:fldCharType'), 'end')

r_element = run._r
r_element.append(fldChar)
r_element.append(instrText)
r_element.append(fldChar2)
r_element.append(fldChar4)
p_element = paragraph._p

But later to make TOC visible it requires to update fields. Mentioned bellow solution involves update it manually (right-click on TOC hint and choose 'update fields'). For the automatic updating, I've found the following solution with word application simulation (thanks to this answer https://stackoverflow.com/a/34818909/9472173)

import win32com.client
import inspect, os

def update_toc(docx_file):
    word = win32com.client.DispatchEx("Word.Application")
    doc = word.Documents.Open(docx_file)
    doc.TablesOfContents(1).Update()
    doc.Close(SaveChanges=True)
    word.Quit()

def main():
    script_dir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
    file_name = 'doc_with_toc.docx'
    file_path = os.path.join(script_dir, file_name)
    update_toc(file_path)

if __name__ == "__main__":
    main()

It pretty works on Windows, but obviously not on Linux. Have someone any ideas about how to provide the same functionality on Linux. The only one suggestion I have is to use local URLs (anchors) to every heading, but I am not sure is it possible with python-docx, also I'm not very strong with these openxml features. I will very appreciate any help.

Upvotes: 6

Views: 11552

Answers (2)

venkat
venkat

Reputation: 1

import docx.oxml.ns as ns

def update_table_of_contents(doc):
    # Find the settings element in the document
    settings_element = doc.settings.element

    # Create an "updateFields" element and set its "val" attribute to "true"
    update_fields_element = docx.oxml.shared.OxmlElement('w:updateFields')
    update_fields_element.set(ns.qn('w:val'), 'true')

    # Add the "updateFields" element to the settings element
    settings_element.append(update_fields_element)

Upvotes: 0

Faizan Amin
Faizan Amin

Reputation: 458

I found a solution from this Github Issue. It work on ubuntu.

def set_updatefields_true(docx_path):
    namespace = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"
    doc = Document(docx_path)
    # add child to doc.settings element
    element_updatefields = lxml.etree.SubElement(
        doc.settings.element, f"{namespace}updateFields"
    )
    element_updatefields.set(f"{namespace}val", "true")
    doc.save(docx_path)## Heading ##

Upvotes: 4

Related Questions