Alexander Teut

Reputation: 343

How to remove a specific image from a PDF with PDFBox

I need to remove a specific image from a PDF file according to its metadata. Sadly, all the examples I can find on the Internet use deprecated methods.

I wrote something like this:

try (PDDocument doc = PDDocument.load(new ByteArrayInputStream(pdf))) {
    doc.getPages().forEach(page -> {
        PDResources resources = page.getResources();
        List<COSName> itemsToRemove = new ArrayList<>();

        resources.getXObjectNames().forEach(propertyName -> {
            if (!resources.isImageXObject(propertyName)) {
                return;
            }
            PDXObject pdxObject = resources.getXObject(propertyName);
            PDImageXObject pdImageXObject = (PDImageXObject) pdxObject;
            PDMetadata metadata = pdImageXObject.getMetadata();
            if (checkMetadata(metadata)) {
                // What should I use here?
                page.getCOSObject().removeItem(propertyName);
            }
        });
        // Should I use page.setResources(resources); ?
    });
    doc.save(baos);
} catch (Exception e) {
    // Code here
}

Upvotes: 2

Views: 2565

Answers (1)

Alexander Teut

Reputation: 343

It works the same way as in the RemoveAllText.java example, just with a different operator.

Use the code from that example, but filter out the "Do" operator (which draws an XObject such as an image) instead of "Tj".

Of course, if you need to load metadata, etc., you should enumerate and check the images through the page resources (as in my example).
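
To make the filtering step concrete, here is a minimal sketch that does not use PDFBox at all. It models content-stream tokens as plain strings (names start with "/", operators are bare words), which is an assumption made purely for illustration. The class name ImageDrawFilter and the method removeImageDraws are hypothetical, not part of any library. It drops every "/Name Do" pair whose name is in a given set and leaves everything else untouched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical helper: models the token-filtering idea with plain strings,
// not with real PDFBox token objects.
public class ImageDrawFilter {

    /**
     * Returns a copy of the token list with every "/Name Do" pair removed
     * when Name (without the leading slash) is contained in namesToRemove.
     */
    public static List<String> removeImageDraws(List<String> tokens, Set<String> namesToRemove) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if ("Do".equals(token) && !kept.isEmpty()) {
                // The operand of Do is the token written just before it.
                String operand = kept.get(kept.size() - 1);
                if (operand.startsWith("/") && namesToRemove.contains(operand.substring(1))) {
                    kept.remove(kept.size() - 1); // drop the operand
                    continue;                     // and skip the operator itself
                }
            }
            kept.add(token);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
                "q", "100", "0", "0", "100", "50", "700", "cm", "/Im1", "Do", "Q",
                "/Im2", "Do");
        // Im1 stands for a name whose image matched the metadata check.
        System.out.println(removeImageDraws(tokens, Collections.singleton("Im1")));
    }
}
```

In real code the tokens would come from PDFStreamParser, the operator check would compare the name of an Operator token to "Do", the operand would be a COSName, and the filtered list would be written back with ContentStreamWriter, as RemoveAllText.java does for text. The set of names would be collected from the page resources using the metadata check from the question. Note that this only removes the draw call. If the image data should also disappear from the file, the corresponding entry probably has to be removed from the XObject dictionary of the page resources as well.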

Upvotes: 2
