New2Java

Reputation: 313

How to update page numbers in Table of Contents (TOC) or create a TOC with Page numbers in a Word document using Java and Apache POI

I am working on a Java 8 project where I need to modify a Word document template (.docx) using Apache POI v4.1.2. The template contains multiple sections with tables, paragraphs, and images. My task is to delete certain sections based on specific criteria and then update the content with new details to generate a final report.

I have successfully implemented the section deletion functionality. However, after deleting sections, the page numbers in the template change, and these changes are not reflected in the Table of Contents (TOC). I need assistance in programmatically updating the page numbers in the TOC with complete automation.

I have already tried xwpfDocument.enforceUpdateFields(), but it causes a popup when the document is opened, which is unacceptable to the stakeholders. Therefore, I'm looking for a programmatic solution that either updates the stale TOC page numbers without any popup, or creates a new TOC (or TOC-like structure) listing the section and subsection headings with their page numbers.

Additionally, I am constrained to using Apache POI. Changing the library at this point is not feasible, as most of the logic is already written and working as expected. I also cannot use a macro-based approach due to security concerns.

Could anyone advise how to achieve this automated update or addition of TOC page numbers using Java and Apache POI? Any code snippets, suggestions, or alternative approaches/hacks would be greatly appreciated.

Thank you in advance for your help!

Upvotes: 1

Views: 782

Answers (1)

Axel Richter

Reputation: 61852

About looking for a canonical answer

I am not a member of the Apache POI developer team, but I have worked with Apache POI for a long time, so I believe I know something about it.

Short answer

Using Apache POI, it is not possible up to now to update the table of contents (TOC) of an XWPFDocument. And I doubt that it will become possible later, unless Apache POI decides to implement a renderer for documents.

A table of contents consists of a list of headings (paragraphs having a heading style), each pointing to the page on which that heading is placed. And that is the problem. To know on which page a paragraph is placed, the document needs to be rendered. From the file storage's point of view, a document consists of an endless stream of body elements. There may be explicit page breaks, but there need not be. If there are none, only the renderer can determine on which page a body element is placed. That depends on page size, page margins, font sizes, possible explicit line breaks, paragraph spacing, and much more.
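To illustrate the point, here is a toy sketch in plain Java. It is not Apache POI code and it is nowhere near how Word really lays out text; the class ToyPaginator, the method pagesFor, and all heights are hypothetical values chosen only for illustration. It just shows that the same stream of paragraphs lands on different pages as soon as one layout parameter (here the usable page height) changes, which is why the page number cannot be read from the stored file.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: real layout also depends on fonts, margins, spacing,
// line wrapping, tables, images and more. None of this is Apache POI API.
class ToyPaginator {

    /**
     * Assigns a 1-based page number to each paragraph, given the height each
     * paragraph would occupy and the usable height of one page.
     */
    static List<Integer> pagesFor(double[] paragraphHeights, double pageHeight) {
        List<Integer> pages = new ArrayList<>();
        int page = 1;
        double used = 0;
        for (double height : paragraphHeights) {
            // Start a new page when the paragraph no longer fits on the current one.
            if (used > 0 && used + height > pageHeight) {
                page++;
                used = 0;
            }
            pages.add(page);
            used += height;
        }
        return pages;
    }

    public static void main(String[] args) {
        double[] heights = {300, 300, 300, 300}; // hypothetical paragraph heights in points
        System.out.println(pagesFor(heights, 700)); // prints [1, 1, 2, 2]
        System.out.println(pagesFor(heights, 500)); // prints [1, 2, 3, 4]
    }
}
```

The stored document corresponds only to the heights array (and not even that, since the heights themselves come out of text layout). The page numbers exist only after something has performed the layout step.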

Apache POI is only meant to create Office Open XML files as Microsoft Office would store them. It does not provide renderers up to now. A small exception is XSLF (PowerPoint presentations). There, someone has programmed a picture export of slides, which also needs a renderer for slides. But rendering slides is much simpler than rendering whole word-processing documents.

Going in detail

XWPFDocument provides XWPFDocument.createTOC but no methods to update a TOC. Not even a getter to get the TOC from the document is provided.

That could easily be changed by extending XWPFDocument. But what would we do with the TOC once we had it?

Looking into the source code of TOC.java, we find the method public void addRow(int level, String title, int page, String bookmarkRef). That method adds a row to the table of contents, where int page should give the page on which the title is placed.

Looking into the source code of createTOC in XWPFDocument.java, we find:

...
toc.addRow(level, par.getText(), 1, "112723803");
...

That means a 1 is set for int page in each row of the table of contents. Why? Because Apache POI cannot determine on which exact page the found paragraph having a heading style is placed. Why? Because Apache POI cannot render the document. So it sets 1 and delegates updating the TOC to Microsoft Word, as Microsoft Word will render the document when opening it.

Conclusion

Using Apache POI, it is not possible up to now to update the table of contents (TOC) of an XWPFDocument.

Upvotes: 1
