Reputation: 65
I've read the posts How to split a PDF using Apache PDFBox? and How to merge two PDF files into one in Java? However, they only demonstrate how to split a document at every page or into equal chunks, and the merger API's addSource() overloads seem to accept only File, String and InputStream, not PDDocument.
I would like to insert a one page pdf file into 3 places of a larger pdf file (say 100 pages) at specified page numbers, e.g. pages 3, 7 and 10. So, I need to split the larger document at pages 3, 7 and 10, then insert the one page pdf doc, and then merge all the split parts together in a new pdf file.
I have attempted to do as follows:
PDDocument doc;
PDDocument onePage;
Splitter splitDoc = new Splitter();
PDFMergerUtility mergedDoc = new PDFMergerUtility();
onePage = PDDocument.load("/path/onepage.pdf");
doc = PDDocument.load("/path/hundredpages.pdf");
splitDoc.setSplitAtPage(1); // inefficient
// is there a better solution for split?
List<PDDocument> splitDocs = splitDoc.split(doc);
for (int i=0; i<splitDocs.size(); i++) {
    if (i==2 || i==7 || i==10) { // only to demonstrate
        mergedDoc.addSource(onePage); // see comment below
    } else {
        // doesn't accept PDDocument
        // what's the alternative without resorting to InputStream
        mergedDoc.addSource(splitDocs.remove(0));
    }
}
mergedDoc.setDestinationFileName("/path/mergeddoc.pdf");
mergedDoc.mergeDocuments();
Where am I going wrong, or is there a better way?
Upvotes: 0
Views: 1868
Reputation: 95928
This answer is about what you actually want to achieve, i.e.
I would like to insert a one page pdf file into 3 places of a larger pdf file (say 100 pages) at specified page numbers, e.g. pages 3, 7 and 10.
and not what you think you have to do for that, i.e.
So, I need to split the larger document at pages 3, 7 and 10, then insert the one page pdf doc, and then merge all the split parts together in a new pdf file.
Furthermore, I assume you are still using PDFBox version 1.8.x, not a 2.0.0 release candidate.
To insert pages into a document (represented by a PDDocument instance) you actually don't have to split and re-merge that document; you merely have to add the page at the given indices. Thus, we can simplify the approach.
At the same time, though, there is a detail in your task which complicates it again: you cannot insert the identical page object multiple times into the same target document; you have to create at least a shallow copy of it for each additional insertion.
Taking this into account, you can insert a one page pdf file into 3 places of a larger pdf like this:
PDDocument document = ...;
PDDocument singlePageDocument = ...;
// the page to insert (PDFBox 1.8.x API)
PDPage singlePage = (PDPage) singlePageDocument.getDocumentCatalog().getAllPages().get(0);
PDPageNode rootPages = document.getDocumentCatalog().getPages();
// first insertion: becomes page 3 (list index 2)
rootPages.getKids().add(3-1, singlePage);
singlePage.setParent(rootPages);
// further insertions use a shallow copy of the page dictionary;
// the copy carries over the Parent entry set above
singlePage = new PDPage(new COSDictionary(singlePage.getCOSDictionary()));
rootPages.getKids().add(7-1, singlePage);
singlePage = new PDPage(new COSDictionary(singlePage.getCOSDictionary()));
rootPages.getKids().add(10-1, singlePage);
// refresh the Count entry of the page tree root
rootPages.updateCount();
document.save(...);
(InsertPages.java, method testInsertPages)
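One detail of the index arithmetic is easy to overlook: every insertion shifts all following pages by one. The sketch below is only an illustration using plain Java lists of strings as a stand-in for the page list; it does not use PDFBox, and the class and method names (InsertIndexSketch, insertAsResultPages, insertBeforeOriginalPages) are made up for this example. Inserting in ascending order at index n-1 makes the inserted page the nth page of the resulting document, whereas inserting in descending order places it directly before the original page n.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of PDFBox: models the page list as strings.
final class InsertIndexSketch {

    // Ascending insertion: 'extra' ends up as page n (1-based) of the RESULT.
    static List<String> insertAsResultPages(List<String> pages, String extra, int... pageNumbers) {
        List<String> result = new ArrayList<>(pages);
        int[] sorted = pageNumbers.clone();
        Arrays.sort(sorted);
        for (int n : sorted) {
            result.add(n - 1, extra);
        }
        return result;
    }

    // Descending insertion: 'extra' ends up directly before ORIGINAL page n,
    // because earlier positions are not yet shifted when each insert happens.
    static List<String> insertBeforeOriginalPages(List<String> pages, String extra, int... pageNumbers) {
        List<String> result = new ArrayList<>(pages);
        int[] sorted = pageNumbers.clone();
        Arrays.sort(sorted);
        for (int i = sorted.length - 1; i >= 0; i--) {
            result.add(sorted[i] - 1, extra);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> pages = new ArrayList<>();
        for (int i = 1; i <= 12; i++) {
            pages.add("p" + i);
        }
        System.out.println(insertAsResultPages(pages, "X", 3, 7, 10));
        System.out.println(insertBeforeOriginalPages(pages, "X", 3, 7, 10));
    }
}
```

The PDFBox snippet above corresponds to the first variant: after saving, the inserted page is page 3, 7 and 10 of the new document.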
Beware, though: this code assumes a flat page tree. For deeper page trees you have to walk the page list differently. To insert a page as the nth document page, you cannot simply add it at position n-1 of the Pages root. Instead, you have to inspect the root's kids one by one; when you come across an inner PDPageNode object, read its Count value to find out how many pages it contains, and if that number implies that the insert position lies within it, recurse into that inner PDPageNode object.
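The recursive walk described above can be sketched roughly as follows. To keep the example self-contained it does not use PDFBox at all: Node is a hypothetical stand-in in which a leaf plays the role of a PDPage and an inner node plays the role of a PDPageNode with its Count value; all names here are assumptions made for illustration. Porting it to PDFBox 1.8.x would mean replacing Node with the real classes and checking each kid's type.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a page tree, NOT the PDFBox API.
final class PageTreeSketch {

    static final class Node {
        final String label;                       // non-null for a page (leaf), null for an inner node
        final List<Node> kids = new ArrayList<>();
        int count;                                // number of pages at or below this node

        private Node(String label, int count) { this.label = label; this.count = count; }

        static Node page(String label) { return new Node(label, 1); }

        static Node inner(Node... children) {
            Node n = new Node(null, 0);
            for (Node c : children) { n.kids.add(c); n.count += c.count; }
            return n;
        }

        boolean isPage() { return label != null; }
    }

    // Inserts 'page' so that it gets 0-based position 'index' among the pages below 'node'.
    static void insertAt(Node node, int index, Node page) {
        if (index < 0 || index > node.count) {
            throw new IndexOutOfBoundsException("index " + index + ", count " + node.count);
        }
        int offset = index;                       // pages still to skip at this level
        for (int i = 0; i < node.kids.size(); i++) {
            Node kid = node.kids.get(i);
            if (kid.isPage()) {
                if (offset == 0) { node.kids.add(i, page); node.count++; return; }
                offset--;
            } else if (offset < kid.count) {
                insertAt(kid, offset, page);      // target position lies inside this subtree
                node.count++;                     // keep the ancestor's count in sync
                return;
            } else {
                offset -= kid.count;              // skip the whole subtree
            }
        }
        node.kids.add(page);                      // position is after the last kid
        node.count++;
    }

    // Lists page labels in document order.
    static List<String> flatten(Node node) {
        List<String> out = new ArrayList<>();
        if (node.isPage()) { out.add(node.label); return out; }
        for (Node kid : node.kids) out.addAll(flatten(kid));
        return out;
    }

    // A small nested tree holding pages "1" to "6".
    static Node sampleTree() {
        return Node.inner(Node.page("1"),
                Node.inner(Node.page("2"), Node.page("3"),
                        Node.inner(Node.page("4"), Node.page("5"))),
                Node.page("6"));
    }

    public static void main(String[] args) {
        Node root = sampleTree();
        insertAt(root, 5 - 1, Node.page("X"));    // insert as 5th page
        System.out.println(flatten(root) + " count=" + root.count);
    }
}
```

In this sketch the counts of all ancestors are updated on the way back up the recursion, which is the bookkeeping that a single updateCount() call on the root handles in the flat case.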
Upvotes: 1