Mohan

Reputation: 13

PDFBox renderImageWithDPI produces images with missing content due to absent embedded fonts - how do I resolve this?

PDFBox renderImageWithDPI only partially renders text because of missing embedded(?) fonts.

1.4 : Trailer Syntax error, /XRef cross reference streams are not allowed
5.2.2 : Forbidden field in an annotation definition, Flags of Link annotation are invalid
2.3.2 : Unexpected value for key in Graphic object definition, Unexpected 'true' value for 'Interpolate' Key
2.4.2 : Invalid Color space, The operator "k" can't be used with RGB Profile
2.4.3 : Invalid Color space, The operator "f" can't be used without Color Profile
3.1.4 : Invalid Font definition, ELWKFI+OptimaLTStd: The Charset entry is missing for the Type1 Subset
3.1.4 : Invalid Font definition, JECWGC+InsigniaLTStd: The Charset entry is missing for the Type1 Subset
3.1.4 : Invalid Font definition, PHSMMZ+OptimaLTStd-Bold: The Charset entry is missing for the Type1 Subset
3.1.4 : Invalid Font definition, EHCNNL+OptimaLTStd-Italic: The Charset entry is missing for the Type1 Subset
3.1.4 : Invalid Font definition, QBVSKF+HelveticaLTStd-Obl: The Charset entry is missing for the Type1 Subset
3.1.9 : Invalid Font definition, UBAPGG+OptimaLTStd: mandatory CIDToGIDMap missing
3.1.11 : Invalid Font definition, UBAPGG+OptimaLTStd: The CIDSet entry is missing for the Composite Subset
3.2.3 : Font damaged, UBAPGG+OptimaLTStd: The FontFile can't be read
3.1.9 : Invalid Font definition, ORMCFE+HelveticaLTStd-Obl: mandatory CIDToGIDMap missing
3.1.11 : Invalid Font definition, ORMCFE+HelveticaLTStd-Obl: The CIDSet entry is missing for the Composite Subset
3.2.3 : Font damaged, ORMCFE+HelveticaLTStd-Obl: The FontFile can't be read
3.1.9 : Invalid Font definition, TFEWKU+HelveticaLTStd-Roman: mandatory CIDToGIDMap missing
3.1.11 : Invalid Font definition, TFEWKU+HelveticaLTStd-Roman: The CIDSet entry is missing for the Composite Subset
3.2.3 : Font damaged, TFEWKU+HelveticaLTStd-Roman: The FontFile can't be read
3.1.4 : Invalid Font definition, CRQQXS+OptimaLTStd: The Charset entry is missing for the Type1 Subset
3.1.4 : Invalid Font definition, MVVAWX+InsigniaLTStd: The Charset entry is missing for the Type1 Subset
3.1.4 : Invalid Font definition, YIWFBD+OptimaLTStd-Bold: The Charset entry is missing for the Type1 Subset
3.1.11 : Invalid Font definition, JYHLHF+OptimaLTStd: The CIDSet entry is missing for the Composite Subset
3.1.9 : Invalid Font definition, LDXBBC+OptimaLTStd: mandatory CIDToGIDMap missing
3.1.11 : Invalid Font definition, LDXBBC+OptimaLTStd: The CIDSet entry is missing for the Composite Subset
3.2.3 : Font damaged, LDXBBC+OptimaLTStd: The FontFile can't be read
3.1.9 : Invalid Font definition, FSNSYC+OptimaLTStd: mandatory CIDToGIDMap missing
3.1.11 : Invalid Font definition, FSNSYC+OptimaLTStd: The CIDSet entry is missing for the Composite Subset
3.2.3 : Font damaged, FSNSYC+OptimaLTStd: The FontFile can't be read
3.1.9 : Invalid Font definition, LVYKUL+InsigniaLTStd: mandatory CIDToGIDMap missing
3.1.11 : Invalid Font definition, LVYKUL+InsigniaLTStd: The CIDSet entry is missing for the Composite Subset
3.2.3 : Font damaged, LVYKUL+InsigniaLTStd: The FontFile can't be read
3.1.9 : Invalid Font definition, FUYTUP+OptimaLTStd-Italic: mandatory CIDToGIDMap missing
3.1.11 : Invalid Font definition, FUYTUP+OptimaLTStd-Italic: The CIDSet entry is missing for the Composite Subset
3.2.3 : Font damaged, FUYTUP+OptimaLTStd-Italic: The FontFile can't be read
3.1.9 : Invalid Font definition, GZVYQO+OptimaLTStd-Bold: mandatory CIDToGIDMap missing
3.1.11 : Invalid Font definition, GZVYQO+OptimaLTStd-Bold: The CIDSet entry is missing for the Composite Subset
3.2.3 : Font damaged, GZVYQO+OptimaLTStd-Bold: The FontFile can't be read
3.1.9 : Invalid Font definition, GWNIWZ+HelveticaLTStd-Roman: mandatory CIDToGIDMap missing
3.1.11 : Invalid Font definition, GWNIWZ+HelveticaLTStd-Roman: The CIDSet entry is missing for the Composite Subset
3.2.3 : Font damaged, GWNIWZ+HelveticaLTStd-Roman: The FontFile can't be read
7.1 : Error on MetaData, Metadata is not a stream

These findings are consistent with the warnings logged at run time:

May 26, 2023 12:40:01 PM org.apache.pdfbox.pdmodel.font.PDCIDFontType2 <init>
WARNING: Could not read embedded OTF for font GWNIWZ+HelveticaLTStd-Roman
java.io.IOException: head is mandatory
    at org.apache.fontbox.ttf.TTFParser.parseTables(TTFParser.java:182)
    at org.apache.fontbox.ttf.TTFParser.parse(TTFParser.java:150)
    at org.apache.fontbox.ttf.OTFParser.parse(OTFParser.java:79)
    at org.apache.fontbox.ttf.OTFParser.parse(OTFParser.java:27)
    at org.apache.fontbox.ttf.TTFParser.parse(TTFParser.java:106)
    at org.apache.fontbox.ttf.OTFParser.parse(OTFParser.java:73)
    at org.apache.pdfbox.pdmodel.font.PDCIDFontType2.<init>(PDCIDFontType2.java:114)
    at org.apache.pdfbox.pdmodel.font.PDCIDFontType2.<init>(PDCIDFontType2.java:67)
    at org.apache.pdfbox.pdmodel.font.PDFontFactory.createDescendantFont(PDFontFactory.java:138)
    at org.apache.pdfbox.pdmodel.font.PDType0Font.<init>(PDType0Font.java:88)
    at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:96)
    at org.apache.pdfbox.pdmodel.PDResources.getFont(PDResources.java:143)
    at org.apache.pdfbox.contentstream.operator.text.SetFontAndSize.process(SetFontAndSize.java:66)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:849)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:495)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:469)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:142)
    at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:264)
    at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:338)
    at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:259)
    at org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:245)

Additional truncated messages

May 26, 2023 12:40:00 PM org.apache.pdfbox.pdmodel.font.PDCIDFontType2 <init>
WARNING: Could not read embedded OTF for font UBAPGG+OptimaLTStd
java.io.IOException: head is mandatory

May 26, 2023 12:40:01 PM org.apache.pdfbox.pdmodel.font.PDCIDFontType2 <init>
WARNING: Could not read embedded OTF for font GZVYQO+OptimaLTStd-Bold
java.io.IOException: head is mandatory

May 26, 2023 12:40:01 PM org.apache.pdfbox.pdmodel.font.PDCIDFontType2 <init>
WARNING: Could not read embedded OTF for font FUYTUP+OptimaLTStd-Italic
java.io.IOException: head is mandatory

May 26, 2023 12:40:01 PM org.apache.pdfbox.pdmodel.font.PDCIDFontType2 <init>
WARNING: Could not read embedded OTF for font FSNSYC+OptimaLTStd
java.io.IOException: head is mandatory

Although fallback fonts seem to be used, they don't work either.

May 26, 2023 12:40:01 PM org.apache.pdfbox.pdmodel.font.PDCIDFontType2 findFontOrSubstitute
WARNING: Using fallback font LiberationSans for CID-keyed TrueType font GWNIWZ+HelveticaLTStd-Roman

I also see warnings like the one below and am unsure how to address them.

May 26, 2023 12:40:01 PM org.apache.pdfbox.pdmodel.graphics.color.PDICCBased ensureDisplayProfile
WARNING: ICC profile is Perceptual, ignoring, treating as Display class

I need help with two things.

Question 1: How do I add a font?

int position = 0;
PDDocument document = getDocument();
PDPage page = document.getPage(position);
PDResources resources = page.getResources();

// Parse the replacement font and load it into the document (embedSubset = false)
OTFParser otfParser = new OTFParser();
OpenTypeFont otf = otfParser.parse(new File("OptimaLTStd.otf"));
PDFont font = PDType0Font.load(document, otf, false);

// Register the font in the page resources, then put the resources back on the page
resources.add(font);
page.setResources(resources);

// Re-insert the modified page at its position
if (position == 0) {
    document.getPages().remove(page);
    document.getPages().add(page);
} else {
    PDPage prevPage = document.getPage(position - 1);
    document.getPages().insertBefore(page, prevPage);
}
setDocument(document);
setPdfRenderer(document);

Question 2: Is there an override in PDFRenderer to skip glyph processing, so that font-related issues do not affect image generation?

Upvotes: 1

Views: 888

Answers (1)

Tilman Hausherr

Reputation: 18861

The missing text is caused by zero-width definitions for the fonts in the PDF, which incorrectly influence a "stretching" algorithm when rendering. This has been fixed in ticket PDFBOX-5611 and will be in version 2.0.29. Until then, a snapshot build will be available.
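
As a side note on the "head is mandatory" warnings quoted in the question: the stack trace shows they are raised in `TTFParser.parseTables`, that is, while FontBox reads the embedded font's table directory. If you want to inspect which tables a font program actually declares, here is a minimal JDK-only sketch. It assumes the font bytes have already been extracted to a file and are wrapped in an sfnt container (TrueType/OpenType); the names `SfntTables` and `listTables` are made up for illustration and are not part of PDFBox.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class SfntTables {

    /** Returns the four-character table tags declared in an sfnt table directory. */
    static List<String> listTables(byte[] fontData) {
        // ByteBuffer is big-endian by default, which matches the sfnt format
        ByteBuffer buf = ByteBuffer.wrap(fontData);
        if (buf.remaining() < 12) {
            throw new IllegalArgumentException("Too short for an sfnt header");
        }
        buf.getInt();                             // sfntVersion (0x00010000 or 'OTTO')
        int numTables = buf.getShort() & 0xFFFF;  // unsigned 16-bit table count
        buf.position(12);                         // skip searchRange, entrySelector, rangeShift

        List<String> tags = new ArrayList<>();
        // Each table record is 16 bytes: tag, checksum, offset, length
        for (int i = 0; i < numTables && buf.remaining() >= 16; i++) {
            byte[] tag = new byte[4];
            buf.get(tag);
            buf.position(buf.position() + 12);    // skip checksum, offset, length
            tags.add(new String(tag, StandardCharsets.US_ASCII));
        }
        return tags;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java SfntTables <font-file>");
            return;
        }
        List<String> tags = listTables(Files.readAllBytes(Paths.get(args[0])));
        System.out.println(tags);
        System.out.println("has head: " + tags.contains("head"));
    }
}
```

If the output does not contain `head`, that matches what the parser complains about. If the data is not an sfnt container at all, the tags printed will be meaningless. Either way this only helps to understand the warnings; the missing text itself is addressed by the fix mentioned above.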

Upvotes: 0
