Chapo

Reputation: 2543

PyPDF2 nested bookmarks with same name not working

When you try to nest several bookmarks with the same name, PyPDF2 does not take the nesting into account. Below is self-contained Python code that shows what I mean (you need to have three PDF files named a.pdf, b.pdf and c.pdf in the working folder to try it out):

from PyPDF2 import PdfFileReader, PdfFileMerger


def main():
    merger = PdfFileMerger()
    first_one = True
    for file in ["a.pdf", "b.pdf", "c.pdf"]:
        print("next row")
        reader = PdfFileReader(file)
        merger.append(reader)
        if first_one:
            child = merger.addBookmark(title="blabla", pagenum=1)
            first_one = False
        else:
            child = merger.addBookmark(title="blabla", pagenum=1, parent=child)

    merger.write("test.pdf")


if __name__ == "__main__":
    main()

I would expect the resulting PDF to have three levels of nested bookmarks:

blabla
    blabla
        blabla

but instead I get

blabla
    blabla
    blabla

Is there any way to make sure this does not happen?

EDIT: I have removed the pagenum variable, as I want all three bookmarks to point to the same page.

Upvotes: 10

Views: 1896

Answers (1)

kabdulla

Reputation: 5419

This seems to be a bug in the PdfFileMerger.addBookmark() method. There is some detail here

Below is a work-around using PdfFileWriter and its addBookmark() method. With it I get three nested bookmarks with the same name, all on the same page:

blabla
    blabla
        blabla

Code using PdfFileWriter work-around:

from PyPDF2 import PdfFileReader, PdfFileWriter


def main():
    writer = PdfFileWriter()
    pagenum = 0  # 0-based page index; every bookmark points at the first page
    first_one = True
    for file in ["a.pdf", "b.pdf", "c.pdf"]:
        print("next row")
        reader = PdfFileReader(file)
        writer.appendPagesFromReader(reader)
        if first_one:
            child = writer.addBookmark(
                title="blabla", pagenum=pagenum, parent=None
            )
            first_one = False
        else:
            child = writer.addBookmark(
                title="blabla", pagenum=pagenum, parent=child
            )

    with open("test.pdf", "wb") as d:
        writer.write(d)


if __name__ == "__main__":
    main()
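
As a side note, the first_one flag is not strictly needed, because parent=None already means "top level". Below is a minimal sketch of the same nesting loop written as a helper. The names add_nested_chain and FakeWriter are hypothetical and only for illustration; FakeWriter stands in for PdfFileWriter so that the snippet runs without any PDF files, and it assumes the writer exposes addBookmark(title, pagenum, parent) as in the work-around above.

```python
def add_nested_chain(writer, titles, pagenum=0):
    """Add one bookmark per title, each nested under the previous one.

    `writer` is assumed to expose addBookmark(title, pagenum, parent),
    where parent=None creates a top-level bookmark.
    """
    parent = None  # the first bookmark has no parent
    created = []
    for title in titles:
        # The bookmark just created becomes the parent of the next one.
        parent = writer.addBookmark(title, pagenum, parent=parent)
        created.append(parent)
    return created


class FakeWriter:
    """Hypothetical stand-in for PdfFileWriter that only records nesting depth."""

    def __init__(self):
        self.depths = []

    def addBookmark(self, title, pagenum, parent=None):
        depth = 0 if parent is None else parent["depth"] + 1
        self.depths.append(depth)
        return {"title": title, "pagenum": pagenum, "depth": depth}


fake = FakeWriter()
add_nested_chain(fake, ["blabla", "blabla", "blabla"])
print(fake.depths)  # [0, 1, 2]
```

With a real PdfFileWriter you would call add_nested_chain(writer, ["blabla"] * 3) after appending the pages and before writing the file.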

Alternatively, I had a go at modifying the PyPDF2 library to resolve this issue, although I'm not very experienced with Python, so I may have introduced new issues. I have submitted a pull request to the maintainers; until it is merged, you could clone my fork and install PyPDF2 from there:

git clone https://github.com/khalida/PyPDF2.git
cd PyPDF2
python setup.py sdist
sudo -H pip uninstall -y PyPDF2
sudo -H pip install dist/PyPDF2-1.26.0.tar.gz

After that you should be able to get the nesting you want from PdfFileMerger.addBookmark(). I've tested it for the case above, but haven't done any testing beyond that.
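
To check the nesting without opening a PDF viewer, you can read the outline back and print it with indentation. This is a rough sketch, assuming that PdfFileReader.getOutlines() returns a list in which children appear as nested lists and each entry is dict-like with a "/Title" key; format_outline is a hypothetical helper name, and the sample data below is hand-built to mimic that shape rather than read from a real file.

```python
def format_outline(outline, depth=0):
    """Return the bookmark titles as lines, indented by nesting depth."""
    lines = []
    for item in outline:
        if isinstance(item, list):
            # A nested list holds the children of the preceding entry.
            lines.extend(format_outline(item, depth + 1))
        else:
            lines.append("    " * depth + item["/Title"])
    return lines


# Hand-built sample mimicking three nested bookmarks (an assumption, not real output).
sample = [{"/Title": "blabla"}, [{"/Title": "blabla"}, [{"/Title": "blabla"}]]]
print("\n".join(format_outline(sample)))
```

On a real file this would be called as format_outline(PdfFileReader("test.pdf").getOutlines()).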

Upvotes: 5
