Rob Curtis

Reputation: 2265

PDF font issue in Adobe after using overlays in PDFBox

We use PDFBox in one of our applications. Some PDFs that are overlaid result in "broken" output and fonts.

Below is the sample code I'm using to overlay PDFs. The PDFs sometimes have different numbers of pages. We flatten AcroForms and set annotations to read-only. PDF page rotation and bbox sizing are sometimes set differently (especially in files from scanners), so we try to correct for this.

    PDDocument baseDocument = PDDocument.load(new File("base.pdf"));
    PDDocument overlayDocument = PDDocument.load(new File("overlay.pdf"));
    Iterator<PDPage> baseDocumentIterator = baseDocument.getPages().iterator();
    Iterator<PDPage> overlayIterator = overlayDocument.getPages().iterator();
    PDDocument finalOverlayDoc = new PDDocument();
    while(baseDocumentIterator.hasNext() && overlayIterator.hasNext()) {
        PDPage backing = baseDocumentIterator.next();
        //locking annotations per page
        List<PDAnnotation> annotations = backing.getAnnotations();
        for (PDAnnotation a :annotations) {
            a.setLocked(true);
            a.setReadOnly(true);
        }
        // setting size so there's no weird overflow issues
        PDRectangle rect = new PDRectangle();
        rect.setLowerLeftX(0);
        rect.setLowerLeftY(0);
        rect.setUpperRightX(backing.getBBox().getWidth());
        rect.setUpperRightY(backing.getBBox().getHeight());
        backing.setCropBox(rect);
        backing.setMediaBox(rect);
        backing.setBleedBox(rect);
        PDPage pg = overlayIterator.next();
        //setting rotation if different. Some scanners cause issues.
        if(backing.getRotation()!= pg.getRotation())
        {
            pg.setRotation(-backing.getRotation());
        }
        finalOverlayDoc.addPage(pg);
    }
    finalOverlayDoc.close();
    //flatten acroform
    PDAcroForm acroForm = baseDocument.getDocumentCatalog().getAcroForm();
    if (acroForm != null) {
        acroForm.flatten();
        acroForm.setNeedAppearances(false);
    }
    Overlay overlay = new Overlay();
    overlay.setOverlayPosition(Overlay.Position.FOREGROUND);
    overlay.setInputPDF(baseDocument);
    overlay.setAllPagesOverlayPDF(finalOverlayDoc);

    Map<Integer, String> ovmap = new HashMap<Integer, String>();
    overlay.overlay(ovmap);
    PDPageTree allOverlayPages = overlayDocument.getPages();
    if(baseDocument.getPages().getCount() < overlayDocument.getPages().getCount()) //Additional pages in the overlay pdf need to be appended to the base pdf.
    {
        for(int i=baseDocument.getPages().getCount();i<allOverlayPages.getCount(); i++)
        {
            baseDocument.addPage(allOverlayPages.get(i));
        }
    }
    PDDocument finalDocument = new PDDocument();
    for(PDPage p: baseDocument.getPages()){
        finalDocument.addPage(p);
    }

    String filename = "examples/merge_pdf_examples/debug.pdf";
    filename = filename + new Date().getTime() + ".pdf";
    finalDocument.save(filename);
    finalDocument.close();
    baseDocument.close();
    overlayDocument.close();

Upvotes: 1

Views: 293

Answers (1)

mkl

Reputation: 95928

The PDF file you shared contains no error that is relevant to using Overlay.

It does use one seldom-used PDF feature, though: its pages inherit resources from their parent node. Page objects in a PDF are arranged in a tree whose leaves are the actual pages. A page object often carries all the information defining it itself, but a number of page properties can also be carried by an inner node and are then inherited by descendant pages unless those pages override them.

Now that you have shared your code, it turns out that you have a preparation step which loses all inherited information. When you generate finalOverlayDoc from overlayDocument, you essentially do:

    while(overlayIterator.hasNext()) {
        PDPage pg = overlayIterator.next();
        //setting rotation if different. Some scanners cause issues.
        finalOverlayDoc.addPage(pg);
    }

(OverlayDocuments test testOverlayPreparationExampleBroken)

Here you only transport the page object itself, losing all inherited properties.

For the document at hand you can fix this by explicitly setting the page resources to the inherited ones:

    while(overlayIterator.hasNext()) {
        PDPage pg = overlayIterator.next();
        pg.setResources(pg.getResources());
        //setting rotation if different. Some scanners cause issues.
        finalOverlayDoc.addPage(pg);
    }

(OverlayDocuments test testOverlayPreparationFixedExampleBroken)

Beware, though: this explicitly sets only the page resources, but there are other page attributes that can be inherited, too (per the PDF specification: MediaBox, CropBox, and Rotate).
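
To make the mechanism concrete, here is a toy model in plain Java. It is only a sketch: `PageNode`, `InheritanceDemo`, and `moveTo` are hypothetical names invented for illustration and are not PDFBox classes or methods. It assumes that moving a page to another document carries over only the entries stored on the page object itself, as described above. It shows that copying the resolved resources onto the leaf lets them survive the move, while another inherited attribute (here Rotate) is still lost.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for a node in a PDF page tree; NOT a PDFBox class. */
class PageNode {
    PageNode parent;                                   // null for a tree root
    final Map<String, Object> own = new HashMap<>();   // entries stored directly on this node

    PageNode(PageNode parent) {
        this.parent = parent;
    }

    /** Looks up an attribute on this node first, then on its ancestors. */
    Object resolve(String key) {
        for (PageNode n = this; n != null; n = n.parent) {
            if (n.own.containsKey(key)) {
                return n.own.get(key);
            }
        }
        return null;
    }
}

class InheritanceDemo {
    /** Hypothetical analogue of adding a page to another document: only the leaf travels. */
    static void moveTo(PageNode page, PageNode newRoot) {
        page.parent = newRoot;
    }

    public static void main(String[] args) {
        // The old tree keeps Resources and Rotate on the inner node, not on the page.
        PageNode oldRoot = new PageNode(null);
        oldRoot.own.put("Resources", "fonts F1, F2");
        oldRoot.own.put("Rotate", 90);
        PageNode page = new PageNode(oldRoot);

        // Analogue of pg.setResources(pg.getResources()): store the inherited value on the leaf.
        page.own.put("Resources", page.resolve("Resources"));

        moveTo(page, new PageNode(null));

        System.out.println(page.resolve("Resources")); // prints "fonts F1, F2"
        System.out.println(page.resolve("Rotate"));    // prints "null": the inherited value is gone
    }
}
```

Without the `page.own.put(...)` line, both lookups would return `null` after the move, which corresponds to the missing fonts in the broken output.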

I would therefore propose that you don't create a new PDDocument at all: instead of moving the overlayDocument pages to finalOverlayDoc, change them in place. If overlayDocument has more pages than baseDocument, you additionally have to remove the excess pages from overlayDocument. Then use overlayDocument instead of finalOverlayDoc when overlaying.


Looking further down your code, I see that you repeat the anti-pattern of moving page objects to other documents without respecting inherited properties again and again. I think you should completely overhaul that code and remove the anti-pattern.

Upvotes: 2
