Reputation: 2265
We use PDFBox in one of our applications. Some PDFs that are overlaid result in "broken" output and fonts.
Below is the sample code I'm using to overlay PDFs. The PDFs sometimes have different numbers of pages. We flatten AcroForms and set annotations to read-only. PDF page rotation and bounding box sizes are sometimes set differently (especially by scanners), so we try to correct for this.
PDDocument baseDocument = PDDocument.load(new File("base.pdf"));
PDDocument overlayDocument = PDDocument.load(new File("overlay.pdf"));
Iterator<PDPage> baseDocumentIterator = baseDocument.getPages().iterator();
Iterator<PDPage> overlayIterator = overlayDocument.getPages().iterator();
PDDocument finalOverlayDoc = new PDDocument();
while (baseDocumentIterator.hasNext() && overlayIterator.hasNext()) {
    PDPage backing = baseDocumentIterator.next();
    // locking annotations per page
    List<PDAnnotation> annotations = backing.getAnnotations();
    for (PDAnnotation a : annotations) {
        a.setLocked(true);
        a.setReadOnly(true);
    }
    // setting size so there's no weird overflow issues
    PDRectangle rect = new PDRectangle();
    rect.setLowerLeftX(0);
    rect.setLowerLeftY(0);
    rect.setUpperRightX(backing.getBBox().getWidth());
    rect.setUpperRightY(backing.getBBox().getHeight());
    backing.setCropBox(rect);
    backing.setMediaBox(rect);
    backing.setBleedBox(rect);
    PDPage pg = overlayIterator.next();
    // setting rotation if different. Some scanners cause issues.
    if (backing.getRotation() != pg.getRotation()) {
        pg.setRotation(-backing.getRotation());
    }
    finalOverlayDoc.addPage(pg);
}
finalOverlayDoc.close();
// flatten acroform
PDAcroForm acroForm = baseDocument.getDocumentCatalog().getAcroForm();
if (acroForm != null) {
    acroForm.flatten();
    acroForm.setNeedAppearances(false);
}
Overlay overlay = new Overlay();
overlay.setOverlayPosition(Overlay.Position.FOREGROUND);
overlay.setInputPDF(baseDocument);
overlay.setAllPagesOverlayPDF(finalOverlayDoc);
Map<Integer, String> ovmap = new HashMap<Integer, String>();
overlay.overlay(ovmap);
PDPageTree allOverlayPages = overlayDocument.getPages();
// Additional pages in the overlay pdf need to be appended to the base pdf.
if (baseDocument.getPages().getCount() < overlayDocument.getPages().getCount()) {
    for (int i = baseDocument.getPages().getCount(); i < allOverlayPages.getCount(); i++) {
        baseDocument.addPage(allOverlayPages.get(i));
    }
}
PDDocument finalDocument = new PDDocument();
for (PDPage p : baseDocument.getPages()) {
    finalDocument.addPage(p);
}
String filename = "examples/merge_pdf_examples/debug.pdf";
filename = filename + new Date().getTime() + ".pdf";
finalDocument.save(filename);
finalDocument.close();
baseDocument.close();
overlayDocument.close();
Upvotes: 1
Views: 293
Reputation: 95928
There is no error in the PDF file you shared that is relevant for using Overlay.
It does use one seldom-used PDF feature, though: the pages inherit resources from their parent node. Page objects in a PDF are arranged in a tree with the actual pages as leaves. A page object in this tree often carries all the information defining it itself, but a number of page properties can also be carried by an inner node and are inherited by descendant pages unless they override them.
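The lookup rule described above can be illustrated with a small toy model. This is only a sketch: `PageNode`, `set`, and `resolve` are hypothetical names made up for illustration and are not part of the PDFBox API.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a page tree node (hypothetical, not PDFBox API). */
final class PageNode {
    private final Map<String, String> attributes = new HashMap<>();
    private final PageNode parent; // null for the root node

    PageNode(PageNode parent) {
        this.parent = parent;
    }

    /** Stores an attribute directly on this node. */
    void set(String key, String value) {
        attributes.put(key, value);
    }

    /** Returns this node's own value, else the nearest ancestor's, else null. */
    String resolve(String key) {
        for (PageNode node = this; node != null; node = node.parent) {
            String value = node.attributes.get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }
}
```

A leaf created under a root that carries a `Resources` entry resolves that entry through its parent; once the leaf sets its own value, the own value takes precedence.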
After you shared your code it turns out that you have a preparation step which loses all inherited information: When you generate finalOverlayDoc from overlayDocument you essentially do:
while (overlayIterator.hasNext()) {
    PDPage pg = overlayIterator.next();
    // setting rotation if different. Some scanners cause issues.
    finalOverlayDoc.addPage(pg);
}
(OverlayDocuments test testOverlayPreparationExampleBroken)
Here you only transport the page object itself, losing all inherited properties.
For the document at hand you can fix this by explicitly setting the page resources to the inherited ones:
while (overlayIterator.hasNext()) {
    PDPage pg = overlayIterator.next();
    pg.setResources(pg.getResources());
    // setting rotation if different. Some scanners cause issues.
    finalOverlayDoc.addPage(pg);
}
(OverlayDocuments test testOverlayPreparationFixedExampleBroken)
Beware, though: this only explicitly sets the page resources, but there are also other page attributes which can be inherited.
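As a rough sketch of what covering the other attributes could look like: the PDF specification marks Resources, MediaBox, CropBox, and Rotate as inheritable page attributes, and the corresponding PDFBox 2.x getters on PDPage resolve inherited values. The helper below (the class name `InheritedPageAttributes` and method name `pin` are made up for illustration) writes the effective values back onto the page object itself. It assumes PDFBox 2.x on the classpath and has not been checked against your documents.

```java
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDResources;

/** Hypothetical helper, assuming PDFBox 2.x. */
final class InheritedPageAttributes {
    private InheritedPageAttributes() {
    }

    /**
     * Copies the effective (possibly inherited) values of the inheritable
     * page attributes onto the page dictionary itself.
     */
    static void pin(PDPage page) {
        PDResources resources = page.getResources();
        if (resources != null) {
            page.setResources(resources);
        }
        page.setMediaBox(page.getMediaBox());
        page.setCropBox(page.getCropBox());
        page.setRotation(page.getRotation());
    }
}
```

Calling `InheritedPageAttributes.pin(pg)` would take the place of the single `pg.setResources(pg.getResources())` line. Even so, leaving the pages in their original document avoids the issue altogether.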
I would propose, therefore, that you don't create a new PDDocument at all; instead of moving the overlayDocument pages to finalOverlayDoc, only change them in place. If overlayDocument has more pages than baseDocument, you additionally have to remove excess pages from overlayDocument. Then use overlayDocument in overlaying instead of finalOverlayDoc.
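A minimal sketch of this in-place approach, assuming PDFBox 2.x as in the question. The class name `InPlaceOverlay`, its method names, and the file parameters are placeholders invented for illustration; the annotation locking, box normalisation, and form flattening from the question are left out for brevity.

```java
import java.io.File;
import java.io.IOException;
import java.util.HashMap;
import org.apache.pdfbox.multipdf.Overlay;
import org.apache.pdfbox.pdmodel.PDDocument;

/** Hypothetical helper, assuming PDFBox 2.x. */
final class InPlaceOverlay {
    private InPlaceOverlay() {
    }

    /** Removes trailing pages until the document has at most maxPages pages. */
    static void trimToPageCount(PDDocument document, int maxPages) {
        while (document.getNumberOfPages() > maxPages) {
            document.removePage(document.getNumberOfPages() - 1);
        }
    }

    /** Overlays overlayFile onto baseFile without moving pages between documents. */
    static void overlay(File baseFile, File overlayFile, File outputFile) throws IOException {
        try (PDDocument base = PDDocument.load(baseFile);
             PDDocument overlayDoc = PDDocument.load(overlayFile)) {
            // Any per-page adjustments (e.g. rotation) would be applied here,
            // directly on the pages of overlayDoc.
            trimToPageCount(overlayDoc, base.getNumberOfPages());

            Overlay overlay = new Overlay();
            overlay.setOverlayPosition(Overlay.Position.FOREGROUND);
            overlay.setInputPDF(base);
            overlay.setAllPagesOverlayPDF(overlayDoc);
            overlay.overlay(new HashMap<Integer, String>());

            // Save before either document is closed, as the question's code does.
            base.save(outputFile);
        }
    }
}
```

Because the overlay pages never leave overlayDocument, their parent nodes stay intact and inherited attributes remain resolvable. Note that this sketch drops the excess overlay pages rather than appending them to the result.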
Looking further down your code, I see you repeat this anti-pattern of moving page objects to other documents without respecting inherited properties again and again. I guess you should completely overhaul that code to remove it.
Upvotes: 2