Reputation: 10043
I am using PDFBox to extract text from several PDF documents. While running my unit test suite (via Gradle) I get intermittent failures caused by a NullPointerException. My working assumption is that multiple threads are trying to load the same font into the font dictionary cache at the same time.
I know, as stated in the FAQ, that PDFBox is not thread-safe. However, the impression I got from the FAQ and from this discussion is that the restriction applies specifically to multiple threads accessing the same document at the same time, and the comment there appears to suggest that the FontBox cache is expected to be thread-safe.
The exception I am getting in my unit test is:
WARNING: Using fallback font 'LiberationSans-Bold' for 'Arial-BoldItalicMT'
java.lang.NullPointerException:
    at org.apache.pdfbox.pdmodel.font.FontMapperImpl.getFont(FontMapperImpl.java:463)
    at org.apache.pdfbox.pdmodel.font.FontMapperImpl.findFont(FontMapperImpl.java:417)
    at org.apache.pdfbox.pdmodel.font.FontMapperImpl.getTrueTypeFont(FontMapperImpl.java:321)
    at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.<init>(PDTrueTypeFont.java:198)
    at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:75)
    at org.apache.pdfbox.pdmodel.PDResources.getFont(PDResources.java:123)
    at org.apache.pdfbox.contentstream.operator.text.SetFontAndSize.process(SetFontAndSize.java:60)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:815)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:472)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:446)
    ...
Oct 03, 2016 12:21:24 PM org.apache.pdfbox.pdmodel.font.PDTrueTypeFont <init>
WARNING: Using fallback font 'LiberationSans-Bold' for 'Arial-BoldMT'
Oct 03, 2016 12:21:24 PM org.apache.pdfbox.pdmodel.font.PDTrueTypeFont <init>
I am using PDFBox version 2.0.2.
Has anyone come across this before?
Upvotes: 0
Views: 1407
Reputation: 10043
This has been fixed in PDFBox as of version 2.0.4.
Details are in the original ticket: https://issues.apache.org/jira/browse/PDFBOX-3521
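If upgrading is not immediately possible, one possible stopgap is to serialize the extraction step behind a single lock so that only one thread at a time can reach the shared font lookup. The following is only a minimal sketch under that assumption, not something taken from the ticket: the class `SerializedExtraction`, its methods, and the `extractor` function are hypothetical names, and `extractor` stands in for whatever code loads a document and pulls its text out with PDFBox.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper: not part of PDFBox.
public class SerializedExtraction {

    // One JVM-wide lock. Assumption: the race is in state shared across
    // documents (the font cache), so a per-document lock would not help.
    private static final Object EXTRACTION_LOCK = new Object();

    // Runs the extractor while holding the shared lock, so at most one
    // thread is inside the extractor at any time.
    public static <T> String extractSerialized(Function<T, String> extractor, T source) {
        synchronized (EXTRACTION_LOCK) {
            return extractor.apply(source);
        }
    }

    // Submits one task per source to a fixed thread pool and returns the
    // results in the same order as the inputs.
    public static <T> List<String> extractAll(Function<T, String> extractor,
                                              List<T> sources,
                                              int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (T source : sources) {
                Callable<String> task = () -> extractSerialized(extractor, source);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // rethrows any failure from the task
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

In a real test suite the `extractor` would wrap the PDFBox calls (loading the document and running `PDFTextStripper.getText` on it). The lock removes the parallelism from the extraction itself, so this trades speed for stability and only makes sense until the dependency can be moved to 2.0.4 or later.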
Upvotes: 2