Reputation: 7032
I'm building a tool to compress PDF files using pdfbox. I have some images with the DCTDecode + FlateDecode filters, and I'd like to experiment with the JPXDecode filter to see if it occupies less space. I've seen some code using iText, but how can I do it with pdfbox? I've found no documentation on how to do so.
Upvotes: 2
Views: 1734
Reputation: 18926
This code replaces the image stream without having to alter COSWriter (which sounds scary). However, with the PDF I tried, the encoded image came out incorrect, which suggests a bug in the JPEG 2000 encoder, so check your result PDFs.
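    import java.awt.image.BufferedImage;
    import java.io.ByteArrayOutputStream;
    import java.io.File;
    import java.io.IOException;
    import java.io.OutputStream;
    import java.util.Arrays;
    import javax.imageio.ImageIO;
    import org.apache.pdfbox.cos.COSName;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.pdmodel.PDResources;
    import org.apache.pdfbox.pdmodel.graphics.PDXObject;
    import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;

    public class SO57972743
    {
        public static void main(String[] args) throws IOException
        {
            // list the writer formats, since a JPEG 2000 *writer* plugin is needed below
            System.out.println("supported writer formats: " + Arrays.toString(ImageIO.getWriterFormatNames()));
            try (PDDocument doc = PDDocument.load(new File("test.pdf")))
            {
                // get 1st level images only here (there may be more in form XObjects!)
                PDResources res = doc.getPage(0).getResources();
                for (COSName name : res.getXObjectNames())
                {
                    PDXObject xObject = res.getXObject(name);
                    if (xObject instanceof PDImageXObject)
                    {
                        replaceImageWithJPX(xObject);
                    }
                }
                doc.save("test-result.pdf");
            }
        }

        private static void replaceImageWithJPX(PDXObject xObject) throws IOException
        {
            PDImageXObject img = (PDImageXObject) xObject;
            BufferedImage bim = img.getOpaqueImage(); // the mask (if there) won't be touched
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            // returns false if no JPEG 2000 writer plugin is on the classpath
            boolean written = ImageIO.write(bim, "JPEG2000", baos);
            if (!written)
            {
                System.err.println("write failed");
                return;
            }
            // replace image stream with the raw (already encoded) JPEG 2000 bytes
            try (OutputStream os = img.getCOSObject().createRawOutputStream())
            {
                os.write(baos.toByteArray());
            }
            img.getCOSObject().setItem(COSName.FILTER, COSName.JPX_DECODE); // replace filter
            img.getCOSObject().removeItem(COSName.COLORSPACE); // use the colorspace in the image itself
        }
    }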
Upvotes: 2
Reputation: 7032
With pdfbox it is possible to compress all images by using a custom COSWriter that handles all image streams and recodes them with the JPXDecode filter. pdfbox itself isn't able to encode JPEG 2000, but the JAI library with a plugin can generate a JPEG2000 image. The compression factor is configurable, and high compression ratios can be achieved without losing too much quality.
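As a rough illustration of the "configurable compression factor", here is a minimal sketch that goes through the generic javax.imageio ImageWriteParam API. The class name QualityEncodeSketch and the helper encode are made up for this example. Whether a given JPEG 2000 plugin honours setCompressionQuality through this generic route is an assumption; some plugins expose their own parameter class instead. The demo falls back to plain "jpeg" so it runs without any plugin.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Iterator;
import java.util.Random;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;

public class QualityEncodeSketch
{
    // Hypothetical helper: encodes the image with the first writer found for
    // formatName, or returns null when no such writer is installed.
    static byte[] encode(BufferedImage image, String formatName, float quality) throws IOException
    {
        Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName(formatName);
        if (!writers.hasNext())
        {
            return null; // e.g. "jpeg2000" without a JAI / JPEG 2000 plugin
        }
        ImageWriter writer = writers.next();
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ImageOutputStream ios = ImageIO.createImageOutputStream(baos))
        {
            writer.setOutput(ios);
            ImageWriteParam param = writer.getDefaultWriteParam();
            if (param.canWriteCompressed())
            {
                param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
                String[] types = param.getCompressionTypes();
                if (types != null && types.length > 0)
                {
                    param.setCompressionType(types[0]);
                }
                // 0.0 = strongest compression, 1.0 = best quality
                param.setCompressionQuality(quality);
            }
            writer.write(null, new IIOImage(image, null, null), param);
        }
        finally
        {
            writer.dispose();
        }
        return baos.toByteArray();
    }

    public static void main(String[] args) throws IOException
    {
        // noisy test image so that the quality setting has a visible effect on size
        BufferedImage img = new BufferedImage(128, 128, BufferedImage.TYPE_INT_RGB);
        Random rnd = new Random(42);
        for (int y = 0; y < 128; y++)
        {
            for (int x = 0; x < 128; x++)
            {
                img.setRGB(x, y, rnd.nextInt(0x1000000));
            }
        }
        String format = ImageIO.getImageWritersByFormatName("jpeg2000").hasNext() ? "jpeg2000" : "jpeg";
        System.out.println(format + " at 0.2: " + encode(img, format, 0.2f).length + " bytes");
        System.out.println(format + " at 0.9: " + encode(img, format, 0.9f).length + " bytes");
    }
}
```

The bytes returned for "jpeg2000" would be the ones to write into the raw image stream before setting the filter to JPXDecode.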
By additionally applying the FlateDecode filter, a little more compression can be obtained with no quality loss.
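To show why the extra Flate pass costs no quality, here is a minimal sketch using only java.util.zip: the already encoded image bytes are deflated and then inflated back to exactly the same bytes. The class name FlateWrapSketch and its helpers are made up for this example, and the input is dummy data rather than a real JPEG 2000 stream. In the PDF itself this corresponds to a filter array that lists FlateDecode before JPXDecode, since filters are applied in that order when decoding. How much is actually gained depends on the image.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class FlateWrapSketch
{
    // Hypothetical helper: zlib/deflate compression, the format FlateDecode reads.
    static byte[] deflate(byte[] input)
    {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished())
        {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Hypothetical helper: the inverse step, i.e. what a PDF reader does for FlateDecode.
    static byte[] inflate(byte[] input) throws DataFormatException
    {
        Inflater inflater = new Inflater();
        inflater.setInput(input);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!inflater.finished())
        {
            int n = inflater.inflate(buf);
            if (n == 0 && (inflater.needsInput() || inflater.needsDictionary()))
            {
                break; // truncated or unsupported input
            }
            out.write(buf, 0, n);
        }
        inflater.end();
        return out.toByteArray();
    }

    public static void main(String[] args) throws DataFormatException
    {
        // dummy stand-in for encoded image bytes (assumption: not a real JPX stream)
        byte[] encoded = new byte[10000];
        for (int i = 0; i < encoded.length; i++)
        {
            encoded[i] = (byte) (i % 17);
        }
        byte[] wrapped = deflate(encoded);
        byte[] restored = inflate(wrapped);
        System.out.println("before: " + encoded.length + " bytes, after Flate: " + wrapped.length + " bytes");
        System.out.println("lossless round trip: " + Arrays.equals(encoded, restored));
    }
}
```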
Upvotes: 1