MoonLock68

Reputation: 11

PDFBox 3.0.2: Can't seem to properly tag image for accessibility

I'm trying to create a PDF document with better accessibility. It doesn't need to pass any particular standard, but I would like to add alt text to images. I can tag text properly, but I can't seem to tag images properly using a similar workflow. I referenced this post and its related post to create my current code.

My code so far:

public static void main(String[] args) throws IOException {
        int mcidCounter = 0;
        int structParentCounter = 0;
        PDDocument document = new PDDocument();
        PDPage page = new PDPage(PDRectangle.A4);
        document.addPage(page);

        page.setStructParents(structParentCounter);

        PDPageContentStream contentStream = null;
        try {
            contentStream = new PDPageContentStream(document, page);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }

        PDImageXObject pdImage = PDImageXObject.createFromFile("image_file", document);

        PDDocumentCatalog catalog = document.getDocumentCatalog();
        PDStructureTreeRoot structureTreeRoot = new PDStructureTreeRoot();
        catalog.setStructureTreeRoot(structureTreeRoot);

        PDViewerPreferences prefs = new PDViewerPreferences(new COSDictionary());
        prefs.setDisplayDocTitle(true);
        catalog.setViewerPreferences(prefs);

        PDMarkInfo markInfo = new PDMarkInfo();
        markInfo.setMarked(true);
        catalog.setMarkInfo(markInfo);

        PDStructureElement documentElement = new PDStructureElement(StandardStructureTypes.DOCUMENT, structureTreeRoot);
        structureTreeRoot.appendKid(documentElement);

        PDStructureElement paragraphElement = new PDStructureElement(StandardStructureTypes.P, documentElement);
        paragraphElement.setPage(page);
        documentElement.appendKid(paragraphElement);

        COSDictionary markedContentDictionary = new COSDictionary();
        markedContentDictionary.setInt(COSName.MCID, mcidCounter);

        PDMarkedContentReference mcr = new PDMarkedContentReference();
        mcr.setMCID(mcidCounter);
        paragraphElement.appendKid(mcr);

        contentStream.beginMarkedContent(COSName.P, PDPropertyList.create(markedContentDictionary));
        contentStream.setFont(new PDType1Font(Standard14Fonts.FontName.HELVETICA_BOLD), 12);
        contentStream.beginText();
        contentStream.newLineAtOffset(50, 700);
        contentStream.showText("Document Title");
        contentStream.endText();
        contentStream.endMarkedContent();

        PDStructureElement figureElement = new PDStructureElement(StandardStructureTypes.Figure, documentElement);
        figureElement.setPage(page);
        figureElement.setAlternateDescription("Alternate Image Description");
        documentElement.appendKid(figureElement);

        COSDictionary markedContentDictionary3 = new COSDictionary();
        markedContentDictionary3.setInt(COSName.MCID, mcidCounter + 2);
        markedContentDictionary3.setString(COSName.ALT, "Alternate Image Description");

        PDMarkedContentReference mcr3 = new PDMarkedContentReference();
        mcr3.setMCID(mcidCounter + 2);
        figureElement.appendKid(mcr3);

        contentStream.beginMarkedContent(COSName.IMAGE, PDPropertyList.create(markedContentDictionary3));
        contentStream.drawImage(pdImage, 50, 0);
        contentStream.endMarkedContent();

        contentStream.close();

        COSDictionary parentTreeRoot = new COSDictionary();
        PDNumberTreeNode parentTree = new PDNumberTreeNode(parentTreeRoot, COSBase.class);

        Map<Integer, COSObjectable> parentTreeMap = new HashMap<>();
        parentTreeMap.put(structParentCounter, paragraphElement);
        parentTree.setNumbers(parentTreeMap);
        structureTreeRoot.setParentTree(parentTree);

        ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
        document.save(outputStream);

        byte[] pdfBytes = outputStream.toByteArray();
        document.close();

        Path actualPath = Path.of("test.pdf");
        Files.write(actualPath, pdfBytes, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
    }

This is the internal structure of the PDF: [image: PDF structure]
I can't see any structure tree, even though I set it on the document catalog.

Is there something missing in the code or am I misunderstanding what I should be seeing?
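
A note on the parent tree, in case it is relevant: in the PDF specification, the parent tree entry for a page's StructParents key is an array indexed by MCID, where slot i holds the structure element that owns marked content i. The code above instead maps the key to a single element (the paragraph) and uses MCIDs 0 and 2, which leaves a gap at index 1. Below is a minimal plain-Java sketch of that bookkeeping. It makes no PDFBox calls, and ParentTreeSketch, PageTagging, nextMcid and buildParentTree are hypothetical names used only for illustration. Whether this explains the missing structure tree is not certain.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ParentTreeSketch {

    /**
     * Models the tagging state of one page. MCIDs are handed out
     * sequentially so that the per-page parent array stays dense:
     * slot i names the structure element that owns MCID i.
     */
    public static final class PageTagging {
        private final List<String> owners = new ArrayList<>();

        /** Registers the owning element type and returns the MCID assigned to it. */
        public int nextMcid(String ownerType) {
            owners.add(ownerType);
            return owners.size() - 1;
        }

        /** Returns the array that would be stored in the parent tree for this page. */
        public List<String> parentArray() {
            return List.copyOf(owners);
        }
    }

    /**
     * Builds a model of the parent tree: the page's StructParents key
     * maps to the whole MCID-indexed array, not to a single element.
     */
    public static Map<Integer, List<String>> buildParentTree(int structParentsKey, PageTagging page) {
        Map<Integer, List<String>> tree = new TreeMap<>();
        tree.put(structParentsKey, page.parentArray());
        return tree;
    }

    public static void main(String[] args) {
        PageTagging page = new PageTagging();
        int titleMcid = page.nextMcid("P");       // 0, for the title text
        int imageMcid = page.nextMcid("Figure");  // 1, for the image

        System.out.println(titleMcid + " " + imageMcid);  // prints "0 1"
        System.out.println(buildParentTree(0, page));     // prints "{0=[P, Figure]}"
    }
}
```

In PDFBox terms, this would presumably correspond to putting an array containing both the paragraph and the figure element under the page's key in the map passed to setNumbers, and using the same MCIDs in the marked content dictionaries and in the marked content references. That mapping is an assumption on my part, not something I have verified against 3.0.2.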

Upvotes: 1

Views: 36

Answers (0)
