Reputation: 49
I am trying to compress a PDF document in Java. The original file size is 1.5-2 MB and we need to bring it down to less than 1 MB. I tried iText's compression on it, but the results are not that effective and the file is still larger than 1 MB.
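// The merged PDF produced earlier, held in memory.
byte[] mergedFileContent = byteArrayOS.toByteArray();
reader = new PdfReader(mergedFileContent);
// byteArrOScomp is the output stream that receives the compressed copy.
PdfStamper stamper = new PdfStamper(reader, byteArrOScomp);
// Switch to PDF 1.5 style object streams and cross-reference streams.
stamper.setFullCompression();
stamper.close();
reader.close();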
Has anyone worked on something similar? Any inputs would be appreciated.
Upvotes: 2
Views: 6848
Reputation: 11747
A file that has already been compressed to its practical maximum (as many PDFs may already have been encoded, compacted, compressed, and encrypted) cannot be made significantly smaller unless some component is removed, which destroys some of its original qualities or functions.
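A minimal sketch of that point, using only java.util.zip.Deflater rather than a real PDF: the text is deflated once, then the compressed bytes are deflated again. The class name RecompressDemo, the helper names, and the made-up vocabulary are all illustrative assumptions. On input like this, the second pass should gain next to nothing.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Random;
import java.util.zip.Deflater;

class RecompressDemo {
    /** Deflates the input at the maximum level and returns the compressed bytes. */
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Builds pseudo-random text from a small, made-up vocabulary (fixed seed). */
    static byte[] sampleText(int words) {
        String[] vocab = {"page", "font", "stream", "object", "image", "text", "table", "report"};
        Random random = new Random(42);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < words; i++) {
            sb.append(vocab[random.nextInt(vocab.length)]).append(' ');
        }
        return sb.toString().getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] raw = sampleText(20000);
        byte[] once = deflate(raw);   // first pass removes the redundancy
        byte[] twice = deflate(once); // second pass has little left to remove
        System.out.println("raw:            " + raw.length + " bytes");
        System.out.println("deflated once:  " + once.length + " bytes");
        System.out.println("deflated twice: " + twice.length + " bytes");
    }
}
```

This only models a single compressed stream. A real PDF mixes compressed streams with uncompressed structure, which is where the remaining savings have to come from.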
As an example, take a file of similar size to the original (1.5 to 2 MB). This one is 1.82 MB (1,916,023 bytes). The PostScript source was only 1.21 KB, so surely it should be possible to reduce it towards that smaller size?
Well, on opening the PDF it is soon clear that it has 4096 pages, and removing any single page would break its function.
We can compress it some more with, say, an online compressor:
Your PDF are now 17% smaller! 1.83 MB >> 1.53 MB.
That was achieved by optimisation (the number of components, /Size 12293, was reduced to /Size 8413), NOT by compression, which is unchanged (a mix of Zip and Deflate).
I also know that this can be bettered. The tool's message "info: optimized 4096 streams, kept 4096 #orig" means there is now only one stream per page, but adding one more wrapper (/Size 8414) brings the file size down to 1.03 MB (1,081,466 bytes).
Decompressed, the objects (/Size 8412) total 3.13 MB (3,286,658 bytes).
So the file is still functional at about 32.9% of its decompressed size. However, it will never get under the OP's desired 1.00 MB without some loss of function.
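The figures above can be checked with a few lines of arithmetic. This sketch assumes that "MB" in this answer means 1,048,576 bytes (which matches the quoted byte counts) and that the 32.9% is the optimised file relative to the decompressed total. The class name SizeBudget is illustrative only.

```java
class SizeBudget {
    static final long MIB = 1024L * 1024L;

    /** Returns part as a percentage of whole, rounded to one decimal place. */
    static double percentOf(long part, long whole) {
        return Math.round(1000.0 * part / whole) / 10.0;
    }

    public static void main(String[] args) {
        long original = 1_916_023L;     // 1.82 MB source PDF
        long optimised = 1_081_466L;    // 1.03 MB after optimisation
        long decompressed = 3_286_658L; // 3.13 MB with all streams decompressed

        System.out.println("optimised vs decompressed: " + percentOf(optimised, decompressed) + "%");
        System.out.println("optimised vs original:     " + percentOf(optimised, original) + "%");
        System.out.println("bytes still over 1 MB:     " + (optimised - MIB));
    }
}
```

Under those assumptions the optimised file is still 32,890 bytes over the 1 MB target.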
Upvotes: 0
Reputation: 12547
PDFs are already compressed in a number of ways, which prevents external compression utilities from gaining much ground. If you unpack the PDF first, the external utilities have an easier time finding redundancies and patterns to compress.
I know of no tool to unpack a PDF without reprinting it, though. Ghostscript can reprint the existing PDF into a new PDF, and we can tell it to avoid compression in that second version:
gs -dCompressPages=false -dCompressFonts=false -dCompressStreams=false -dEncodeColorImages=false -dEncodeGrayImages=false -dEncodeMonoImages=false -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dPDFSETTINGS=/screen -dSubsetFonts=true -dColorImageResolution=96 -dGrayImageResolution=96 -sOutputFile=raw-copy.pdf src.pdf
Even though the resulting copy is large (as it uses no compression), it can be packed more effectively with external tools:
zpaq a raw-copy.pdf.zpaq raw-copy.pdf -m5 -fragment 9
zstd --ultra -22 raw-copy.pdf
A useful side effect is that we can compress together different versions of a text (unpacked PDFs, unpacked EPUBs, HTMLs, DOCs, RTFs and so on), eliminating redundancy and saving storage space across formats.
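A toy model of why unpacking first helps, using java.util.zip.Deflater in place of zpaq or zstd and made-up page text in place of a real PDF. It compares three sizes: each page deflated on its own (roughly what a PDF does internally), an outer pass over those already-packed pages, and a single outer pass over the raw pages. The class name SolidVsPerStream and the page contents are illustrative assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

class SolidVsPerStream {
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Made-up page content: mostly shared wording plus a few page-specific numbers. */
    static byte[] page(int p) {
        StringBuilder sb = new StringBuilder();
        for (int line = 1; line <= 20; line++) {
            sb.append("BT /F1 12 Tf 72 ").append(760 - 14 * line)
              .append(" Td (Quarterly report, page ").append(p)
              .append(", item ").append((p * 31 + line * 17) % 1000)
              .append(") Tj ET\n");
        }
        sb.append("BT /F1 8 Tf 72 40 Td (Confidential draft. Figures are unaudited ")
          .append("and may change before publication.) Tj ET\n");
        return sb.toString().getBytes(StandardCharsets.US_ASCII);
    }

    /** Returns {raw total, per-page deflate total, outer pass over packed pages, outer pass over raw pages}. */
    static int[] measure(int pages) {
        ByteArrayOutputStream rawAll = new ByteArrayOutputStream();
        ByteArrayOutputStream packedAll = new ByteArrayOutputStream();
        for (int p = 1; p <= pages; p++) {
            byte[] raw = page(p);
            byte[] packed = deflate(raw);
            rawAll.write(raw, 0, raw.length);
            packedAll.write(packed, 0, packed.length);
        }
        return new int[] {
            rawAll.size(),
            packedAll.size(),
            deflate(packedAll.toByteArray()).length,
            deflate(rawAll.toByteArray()).length
        };
    }

    public static void main(String[] args) {
        int[] m = measure(100);
        System.out.println("raw pages:                   " + m[0] + " bytes");
        System.out.println("each page deflated:          " + m[1] + " bytes");
        System.out.println("outer pass over packed data: " + m[2] + " bytes");
        System.out.println("outer pass over raw data:    " + m[3] + " bytes");
    }
}
```

The outer pass over raw data can reuse wording shared between pages, which per-page compression hides from it. Real PDFs and stronger compressors will give different numbers, so treat this as an illustration of the mechanism only.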
Upvotes: -1
Reputation: 1429
You could gzip, zip, etc. the file afterwards. That isn't really a PDF compression format, but if you are constrained and want better compression, compressing the entire file may give good results since it can also compress meta-level data.
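A minimal sketch of wrapping the finished file in gzip with the JDK's own classes. The class name GzipWrap and the tiny stand-in payload are assumptions for illustration. Note that the result is a gzip file, not a PDF, so the receiver has to unpack it before opening it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipWrap {
    /** Wraps the given bytes (for example a finished PDF) in a gzip container. */
    static byte[] gzip(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Restores the original bytes from a gzip container. */
    static byte[] gunzip(byte[] packed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = gz.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] pdf = "%PDF-1.5 stand-in content, not a real document".getBytes();
        byte[] packed = gzip(pdf);
        System.out.println("packed " + pdf.length + " bytes into " + packed.length + " bytes");
        System.out.println("round trip ok: " + java.util.Arrays.equals(pdf, gunzip(packed)));
    }
}
```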
Upvotes: -1
Reputation: 95918
You might want to look into the official iText examples. In particular, the sample HelloWorldCompression shows how to apply different degrees of compression, both at initial PDF creation time and as a post-processing step.
The following method from that sample may help you along.
public void compressPdf(String src, String dest) throws IOException, DocumentException {
    PdfReader reader = new PdfReader(src);
    // Write the result as PDF 1.5, the version that supports object streams.
    PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest), PdfWriter.VERSION_1_5);
    // 9 is the strongest compression level.
    stamper.getWriter().setCompressionLevel(9);
    int total = reader.getNumberOfPages() + 1;
    // Page numbers are 1-based; set each page's content again from its decoded bytes.
    for (int i = 1; i < total; i++) {
        reader.setPageContent(i, reader.getPageContent(i));
    }
    stamper.setFullCompression();
    stamper.close();
    reader.close();
}
If you wonder how I found it: I googled for "itextpdf example full compression" and it was the second result. (The first find contains the same method but is not from the official iText site.)
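To get a feel for how much a level of 9 can buy over faster settings, here is a sketch that deflates the same made-up text at three levels of java.util.zip.Deflater. It assumes, without the sample confirming it, that iText's compression levels follow the same 0 to 9 scale. The class name LevelDemo and the sample text are illustrative.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Random;
import java.util.zip.Deflater;

class LevelDemo {
    /** Deflates the input at the given level (0 to 9, or Deflater.DEFAULT_COMPRESSION). */
    static byte[] deflate(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Builds pseudo-random text from a small, made-up vocabulary (fixed seed). */
    static byte[] sampleText(int words) {
        String[] vocab = {"page", "font", "stream", "object", "image", "text", "table", "report"};
        Random random = new Random(7);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < words; i++) {
            sb.append(vocab[random.nextInt(vocab.length)]).append(' ');
        }
        return sb.toString().getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] raw = sampleText(20000);
        System.out.println("raw:           " + raw.length + " bytes");
        System.out.println("level 1:       " + deflate(raw, Deflater.BEST_SPEED).length + " bytes");
        System.out.println("default level: " + deflate(raw, Deflater.DEFAULT_COMPRESSION).length + " bytes");
        System.out.println("level 9:       " + deflate(raw, Deflater.BEST_COMPRESSION).length + " bytes");
    }
}
```

If the gap between the default level and level 9 is small for your data, the remaining savings are more likely to come from setFullCompression or from changing the content itself (for example, image resolution).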
Upvotes: 4