rhinds

Reputation: 10043

PDFBox NPE loading fonts

I am using PDFBox to extract text from several PDF documents. When running my unit test suite (via Gradle), I get intermittent failures caused by a NullPointerException. My working assumption is that it is caused by multiple threads attempting to load a font into the font dictionary cache at the same time.

I know from the FAQ that PDFBox is not thread-safe. However, the impression I got from the FAQ and from a related discussion is that this warning applies specifically to multiple threads accessing the same document at the same time, and a comment in that discussion appears to suggest that the FontBox cache is expected to be thread-safe.

The exception I am getting in my unit test is:

WARNING: Using fallback font 'LiberationSans-Bold' for 'Arial-BoldItalicMT'
  java.lang.NullPointerException:
  at org.apache.pdfbox.pdmodel.font.FontMapperImpl.getFont(FontMapperImpl.java:463)
  at org.apache.pdfbox.pdmodel.font.FontMapperImpl.findFont(FontMapperImpl.java:417)
  at org.apache.pdfbox.pdmodel.font.FontMapperImpl.getTrueTypeFont(FontMapperImpl.java:321)
  at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.<init>(PDTrueTypeFont.java:198)
  at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:75)
  at org.apache.pdfbox.pdmodel.PDResources.getFont(PDResources.java:123)
  at org.apache.pdfbox.contentstream.operator.text.SetFontAndSize.process(SetFontAndSize.java:60)
  at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:815)
  at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:472)
  at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:446)
  ...
Oct 03, 2016 12:21:24 PM org.apache.pdfbox.pdmodel.font.PDTrueTypeFont <init>
WARNING: Using fallback font 'LiberationSans-Bold' for 'Arial-BoldMT'
Oct 03, 2016 12:21:24 PM org.apache.pdfbox.pdmodel.font.PDTrueTypeFont <init>

I am using PDFBox version 2.0.2.

Has anyone come across this before?

Upvotes: 0

Views: 1407

Answers (1)

rhinds

Reputation: 10043

This has been fixed in PDFBox as of version 2.0.4.

Details in the original ticket here: https://issues.apache.org/jira/browse/PDFBOX-3521
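
If upgrading is not immediately possible, one possible workaround is to make sure only one thread at a time runs the extraction code that triggers the font lookup. The sketch below is a minimal illustration of that idea, assuming the NPE really does come from concurrent font loading within a single JVM. SerializedExtraction and runExclusive are hypothetical names of my own, not part of PDFBox.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper (not a PDFBox class): funnels every extraction call
// through one JVM-wide lock so that only one thread at a time can reach
// the font lookup code.
final class SerializedExtraction {

    // Shared by all callers in this JVM.
    private static final ReentrantLock LOCK = new ReentrantLock();

    private SerializedExtraction() {
    }

    // Runs the given extraction while holding the lock and returns its result.
    // Any exception thrown by the extraction is propagated to the caller.
    static <T> T runExclusive(Callable<T> extraction) throws Exception {
        LOCK.lock();
        try {
            return extraction.call();
        } finally {
            // Always release, even if the extraction throws.
            LOCK.unlock();
        }
    }
}
```

In a test this would wrap the extraction call, for example SerializedExtraction.runExclusive(() -> new PDFTextStripper().getText(document)). It trades parallelism for safety, so upgrading to a fixed version is still the better option.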

Upvotes: 2
