hagem

Reputation: 189

Apache PDFBox and PDF/A-3

Is it possible to use Apache PDFBox to process PDF/A-3 documents? (Especially for changing field values?)

The PDFBox 1.8 Cookbook says that it is possible to create PDF/A-1 documents with pdfaid.setPart(1);

  1. Can I apply pdfaid.setPart(3) for a PDF/A-3 document?

  2. If not: Is it possible to read in a PDF/A-3 document, change some field values, and save it, so that no creation of or conversion to PDF/A-3 is needed and the document is still PDF/A-3?

Upvotes: 3

Views: 12749

Answers (2)

Madal Africa-Guinea

Reputation: 59

How to create a valid PDF/A-{2,3}{B,U,A} file: in this example I render a page of the source PDF to an image and then create a PDF/A-x-y document that contains that image. Only the first page (index 0) is converted. This uses PDFBox 2.0.x.

public static void main(String[] args) throws IOException, TransformerException
{

    String resultFile = "result/PDFA-x.PDF";  
    FileInputStream in = new FileInputStream("src/PDFOrigin.PDF");

    PDDocument doc = new PDDocument();
    try 
    {
        PDPage page = new PDPage();
        doc.addPage(page); 
        doc.setVersion(1.7f);

        /*             
        // A PDF/A file needs to have the font embedded if the font is used for text rendering
        // in rendering modes other than text rendering mode 3.
        //
        // This requirement includes the PDF standard fonts, so don't use their static PDFType1Font classes such as
        // PDFType1Font.HELVETICA.
        //
        // As there are many different font licenses it is up to the developer to check if the license terms for the
        // font loaded allows embedding in the PDF.

        String fontfile = "/org/apache/pdfbox/resources/ttf/ArialMT.ttf"; 
        PDFont font = PDType0Font.load(doc, new File(fontfile));           
        if (!font.isEmbedded())
        {
            throw new IllegalStateException("PDF/A compliance requires that all fonts used for"
                    + " text rendering in rendering modes other than rendering mode 3 are embedded.");
        }
      */ 

        PDPageContentStream contents = new PDPageContentStream(doc, page);
        try 
        {   
            PDDocument docSource = PDDocument.load(in);
            PDFRenderer pdfRenderer = new PDFRenderer(docSource);               
            int numPage = 0;

            BufferedImage imagePage = pdfRenderer.renderImageWithDPI(numPage, 200); 
            PDImageXObject pdfXOImage = LosslessFactory.createFromImage(doc, imagePage);

            contents.drawImage(pdfXOImage, 0,0, page.getMediaBox().getWidth(), page.getMediaBox().getHeight());
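            contents.close();
            docSource.close();

        }catch (Exception e) {
            // Fail loudly: swallowing this would save a PDF with an empty or unfinished page
            throw new IllegalStateException("Could not render the source page", e);
        }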

        // add XMP metadata
        XMPMetadata xmp = XMPMetadata.createXMPMetadata();
        PDDocumentCatalog catalogue = doc.getDocumentCatalog();
        Calendar cal =  Calendar.getInstance();          

        try
        {
            DublinCoreSchema dc = xmp.createAndAddDublinCoreSchema();
           // dc.setTitle(file);
            dc.addCreator("My APPLICATION Creator");
            dc.addDate(cal);

            PDFAIdentificationSchema id = xmp.createAndAddPFAIdentificationSchema();
            id.setPart(3);  //value => 2|3
            id.setConformance("A"); // value => A|B|U

            XmpSerializer serializer = new XmpSerializer();
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            serializer.serialize(xmp, baos, true);

            PDMetadata metadata = new PDMetadata(doc);
            metadata.importXMPMetadata(baos.toByteArray());                
            catalogue.setMetadata(metadata);
        }
        catch(BadFieldValueException e)
        {
            throw new IllegalArgumentException(e);
        }

        // sRGB output intent
        InputStream colorProfile = CreatePDFA.class.getResourceAsStream(
                "../../../pdmodel/sRGB.icc");
        PDOutputIntent intent = new PDOutputIntent(doc, colorProfile);
        intent.setInfo("sRGB IEC61966-2.1");
        intent.setOutputCondition("sRGB IEC61966-2.1");
        intent.setOutputConditionIdentifier("sRGB IEC61966-2.1");
        intent.setRegistryName("http://www.color.org");

        catalogue.addOutputIntent(intent);  
        catalogue.setLanguage("en-US");

        // Viewer preferences need their own dictionary, not the page dictionary
        PDViewerPreferences pdViewer = new PDViewerPreferences(new COSDictionary());
        pdViewer.setDisplayDocTitle(true);
        catalogue.setViewerPreferences(pdViewer);

        PDMarkInfo  mark = new PDMarkInfo(); // new PDMarkInfo(page.getCOSObject()); 
        PDStructureTreeRoot treeRoot = new PDStructureTreeRoot(); 
        catalogue.setMarkInfo(mark);
        catalogue.setStructureTreeRoot(treeRoot);           
        catalogue.getMarkInfo().setMarked(true);

        PDDocumentInformation info = doc.getDocumentInformation();               
        info.setCreationDate(cal);
        info.setModificationDate(cal);            
        info.setAuthor("My APPLICATION Author");
        info.setProducer("My APPLICATION Producer");
        info.setCreator("My APPLICATION Creator");
        info.setTitle("PDF title");
        info.setSubject("PDF to PDF/A{2,3}-{A,U,B}");           

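        doc.save(resultFile);
    }catch (Exception e) {
        throw new IllegalArgumentException(e);
    }finally {
        // Release the new document and the source stream in every case
        doc.close();
        in.close();
    }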
}
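
The comments next to id.setPart(...) and id.setConformance(...) above list the allowed values only loosely. As a small sketch, the combinations could be checked before they are written into the XMP metadata. The class name PdfaLevel and the method isKnownFlavour are hypothetical (they are not part of PDFBox), and the check covers parts 1 to 3 only: PDF/A-1 defines levels A and B, while PDF/A-2 and PDF/A-3 additionally define U.

```java
// Hypothetical helper, not part of PDFBox: checks a part/conformance pair
// for PDF/A parts 1 to 3 before it is passed to setPart/setConformance.
public class PdfaLevel {

    /** Returns true if the pair names a PDF/A-1, -2 or -3 flavour. */
    public static boolean isKnownFlavour(int part, String conformance) {
        switch (part) {
            case 1:
                // PDF/A-1 only defines levels A and B
                return "A".equals(conformance) || "B".equals(conformance);
            case 2:
            case 3:
                // PDF/A-2 and PDF/A-3 add level U
                return "A".equals(conformance)
                        || "B".equals(conformance)
                        || "U".equals(conformance);
            default:
                // Other parts are outside the scope of this sketch
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isKnownFlavour(3, "A")); // prints "true"
        System.out.println(isKnownFlavour(1, "U")); // prints "false"
    }
}
```

Passing the check only means the declaration is well formed. Level A in particular also requires a properly tagged document, which setting the values does not provide.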

Upvotes: 5

hagem

Reputation: 189

PDFBox supports that, but be aware that PDFBox is a low-level library, so you have to ensure conformance yourself; there is no 'Save as PDF/A-3'. You might want to take a look at http://www.mustangproject.org, which uses PDFBox to support ZUGFeRD (electronic invoicing), a format that also requires PDF/A-3.
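
Since nothing enforces conformance on save, one cheap sanity check after editing a document is to confirm that the saved file still declares PDF/A-3 in its XMP packet. The following is only a sketch under stated assumptions: it uses the Java standard library alone, the class PdfaIdCheck and its methods are hypothetical, it only finds XMP that is stored uncompressed in the file, and it verifies the declaration, not the actual conformance of the document.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDFBox: reads the pdfaid declaration
// from raw XMP text (element form or attribute form).
public class PdfaIdCheck {

    // Matches <pdfaid:part>3</pdfaid:part> as well as pdfaid:part="3"
    private static final Pattern PART =
            Pattern.compile("pdfaid:part\\s*(?:=\\s*[\"']|>)\\s*(\\d)");
    private static final Pattern CONFORMANCE =
            Pattern.compile("pdfaid:conformance\\s*(?:=\\s*[\"']|>)\\s*([ABU])");

    /** Returns the declared PDF/A part, or -1 if no declaration is found. */
    public static int declaredPart(String xmp) {
        Matcher m = PART.matcher(xmp);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    /** Returns the declared conformance level (A, B or U), or null. */
    public static String declaredConformance(String xmp) {
        Matcher m = CONFORMANCE.matcher(xmp);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java PdfaIdCheck <file.pdf>");
            return;
        }
        // ISO-8859-1 maps each byte to one char, so binary PDF data decodes without loss
        String raw = new String(Files.readAllBytes(Paths.get(args[0])),
                StandardCharsets.ISO_8859_1);
        System.out.println("part=" + declaredPart(raw)
                + " conformance=" + declaredConformance(raw));
    }
}
```

Running it against the file before and after changing field values shows whether the declaration survived the edit; a full PDF/A validator is still needed to confirm that the document really conforms.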

Upvotes: 5
