Filippo Angileri

Reputation: 3

How to reduce the output PDF size when binding the same PDF many times in Java

I am trying to write a binder in Java that produces a "two-up" layout, reusing the same PDF many times. The problem is that the output file is too big. How can I optimize it?

I enabled smart mode (setSmartMode(true)), but it did not help.

String infile = "D:\\libro\\libro.pdf";
String outfile = "D:\\libro\\test_out.pdf";
// PdfWriter opens the output file itself, so no separate FileOutputStream is needed
PdfDocument pdfDoc = new PdfDocument(new PdfWriter(outfile).setSmartMode(true));
pdfDoc.setDefaultPageSize(PageSize.A2.rotate());
PdfPage pageorig, pagenew;
PdfCanvas canvas;
PdfDocument reader = new PdfDocument(new PdfReader(infile));

int pages = reader.getNumberOfPages();
for (int j = 0; j < 10; j++) {
    for (int i = 1; i <= pages; i++) {

        pageorig = reader.getPage(i);
        pagenew = pdfDoc.addNewPage();
        canvas = new PdfCanvas(pagenew);

        canvas.addXObject(pageorig.copyAsFormXObject(pdfDoc), 0, 0);
        canvas.addXObject(pageorig.copyAsFormXObject(pdfDoc), pageorig.getPageSize().getWidth(), 0);

    }
}
pdfDoc.close();
reader.close();

The original PDF is 20 MB. If I make one copy, the output file is 19 MB, but if I make 10 copies, the output file is 83 MB, which is very big.

EDIT: link to the PDF used: pdf

Upvotes: 0

Views: 202

Answers (1)

Alexey Subach

Reputation: 12312

A simple algorithmic optimization makes the code much faster and also solves the problem with the output file size. Instead of making a new copy of the page each time you insert it (10 * 2 copies per page), you can copy each page once and then reuse that copy 10 * 2 times. In my code I use lazy caching with a Map and make a copy on a cache miss (the page has not been copied yet). It could also be done another way: iterating over the document pages and copying each one in advance.

Here is the optimized version of the code:

String infile = "D:\\libro.pdf";
String outfile = "D:\\test_out.pdf";
PdfDocument pdfDoc = new PdfDocument(new PdfWriter(outfile).setSmartMode(true));
pdfDoc.setDefaultPageSize(PageSize.A2.rotate());
PdfPage pageorig, pagenew;
PdfCanvas canvas;
PdfDocument reader = new PdfDocument(new PdfReader(infile));

// Caching page copies
Map<Integer, PdfFormXObject> pageCopies = new HashMap<>();

int pages = reader.getNumberOfPages();
for (int j = 0; j < 10; j++) {
    for (int i = 1; i <= pages; i++) {
        pageorig = reader.getPage(i);
        PdfFormXObject origPageCopy = pageCopies.get(i);
        // Cache miss, doing a fresh copy
        if (origPageCopy == null) {
            origPageCopy = pageorig.copyAsFormXObject(pdfDoc);
            pageCopies.put(i, origPageCopy);
        }

        pagenew = pdfDoc.addNewPage();
        canvas = new PdfCanvas(pagenew);

        canvas.addXObject(origPageCopy, 0, 0);
        canvas.addXObject(origPageCopy, pageorig.getPageSize().getWidth(), 0);
    }
}
pdfDoc.close();
reader.close();

On my machine, the resulting file size is ~15 MB, even smaller than the original file. Additionally, this code runs in ~3 seconds, compared to ~25 seconds for the initial version of the code.
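
To see why the number of copies matters, here is a minimal library-free sketch of the same copy-once idea. It does not use iText at all: `CopyOnceSketch`, `CountingCopier` and the page count of 5 are hypothetical names and values made up for illustration, and the "copy" is just a string standing in for an expensive page copy. It only counts how often a copy is made in each loop shape.

```java
import java.util.HashMap;
import java.util.Map;

final class CopyOnceSketch {

    /** Hypothetical stand-in for an expensive page copy; it only counts calls. */
    static final class CountingCopier {
        int copies = 0;

        String copy(int pageNumber) {
            copies++;
            return "copy-of-page-" + pageNumber;
        }
    }

    /** Copies the page for every placement. Returns the number of placements. */
    static int placeWithoutCache(int pages, int repeats, CountingCopier copier) {
        int placements = 0;
        for (int j = 0; j < repeats; j++) {
            for (int i = 1; i <= pages; i++) {
                copier.copy(i); // left half of the sheet
                copier.copy(i); // right half of the sheet
                placements += 2;
            }
        }
        return placements;
    }

    /** Copies each page on first use only, then reuses the cached copy. */
    static int placeWithCache(int pages, int repeats, CountingCopier copier) {
        Map<Integer, String> cache = new HashMap<>();
        int placements = 0;
        for (int j = 0; j < repeats; j++) {
            for (int i = 1; i <= pages; i++) {
                // The copier runs only on a cache miss for page i.
                cache.computeIfAbsent(i, n -> copier.copy(n));
                placements += 2; // both halves reuse the same cached copy
            }
        }
        return placements;
    }

    public static void main(String[] args) {
        int pages = 5;    // assumed page count, for illustration only
        int repeats = 10; // same repeat count as in the loops above

        CountingCopier naive = new CountingCopier();
        int naivePlacements = placeWithoutCache(pages, repeats, naive);
        // prints "100 placements, 100 copies"
        System.out.println(naivePlacements + " placements, " + naive.copies + " copies");

        CountingCopier cached = new CountingCopier();
        int cachedPlacements = placeWithCache(pages, repeats, cached);
        // prints "100 placements, 5 copies"
        System.out.println(cachedPlacements + " placements, " + cached.copies + " copies");
    }
}
```

Both loops place every page the same number of times, but the cached version makes one copy per page instead of one per placement (pages instead of pages * repeats * 2).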

Upvotes: 1
