Reputation: 1400
When extracting text from a PDF with iText 5.3.4, I use this code:
try {
    reader = new PdfReader(thepdffilename);
} catch (IOException e) {
    openok=false;
}
if (openok==true){
    int numberOfPages = reader.getNumberOfPages();
    PdfReaderContentParser parser = new PdfReaderContentParser(reader);
    for (int page = 1; page <= numberOfPages; page++){
        try {
            SimpleTextExtractionStrategy strategy = parser.processContent(page, new SimpleTextExtractionStrategy());
            content = content + strategy.getResultantText();
        } catch (Throwable t) {
            crap=true;
            break;
        }
    }
    reader.close();
}
However, the Google Play "Crashes & ANRs" report occasionally shows a NullPointerException inside iText:
java.lang.NullPointerException in com.itextpdf.text.pdf.PdfReader$PageRefs.readPages
    at com.itextpdf.text.pdf.PdfReader$PageRefs.readPages(PdfReader.java:3382)
    at com.itextpdf.text.pdf.PdfReader$PageRefs.<init>(PdfReader.java:3350)
    at com.itextpdf.text.pdf.PdfReader$PageRefs.<init>(PdfReader.java:3328)
    at com.itextpdf.text.pdf.PdfReader.readPages(PdfReader.java:1003)
    at com.itextpdf.text.pdf.PdfReader.readPdf(PdfReader.java:530)
    at com.itextpdf.text.pdf.PdfReader.<init>(PdfReader.java:170)
    at com.itextpdf.text.pdf.PdfReader.<init>(PdfReader.java:159)
The 5.3.4 source code at line 3382 is:
3374 void readPages() throws IOException {
3375 if (refsn != null)
3376 return;
3377 refsp = null;
3378 refsn = new ArrayList<PRIndirectReference>();
3379 pageInh = new ArrayList<PdfDictionary>();
3380 iteratePages((PRIndirectReference)reader.catalog.get(PdfName.PAGES));
3381 pageInh = null;
3382 reader.rootPages.put(PdfName.COUNT, new PdfNumber(refsn.size()));
3383 }
3384
3385 void reReadPages() throws IOException {
3386 refsn = null;
3387 readPages();
3388 }
So something goes wrong when text is extracted from certain PDF files. The cause will probably never be tracked down, because I do not have the PDFs in question.
What I need is a way to catch the NullPointerException so that my app does not crash.
I've tried
} catch (Exception e) {
and, as a last resort to catch any exception,
} catch (Throwable t) {
Does anyone have an idea how I can catch this particular iText error?
Thanks
Upvotes: 0
Views: 2850
Reputation: 95888
If I understand you correctly, your attempts to catch that NPE have been made in your loop through the document pages:
for (int page = 1; page <= numberOfPages; page++){
    try {
        SimpleTextExtractionStrategy strategy =
            parser.processContent(page, new SimpleTextExtractionStrategy());
        content = content + strategy.getResultantText();
    } catch (Throwable t) {
        crap=true;
        break;
    }
}
If you look closely at your Exception, though...
java.lang.NullPointerException in com.itextpdf.text.pdf.PdfReader$PageRefs.readPages
    at com.itextpdf.text.pdf.PdfReader$PageRefs.readPages(PdfReader.java:3382)
    [...]
    at com.itextpdf.text.pdf.PdfReader.<init>(PdfReader.java:159)
you see that the exception is already thrown during the PdfReader construction (PdfReader.<init>). Thus, you have to catch the NPE where you construct your PdfReader:
try {
    reader = new PdfReader(thepdffilename);
} catch (IOException e) {
    openok=false;
} catch (NullPointerException npe) { // !!
    openok=false;                    // !!
}
Or, if you want to take no chances:
try {
    reader = new PdfReader(thepdffilename);
} catch (Throwable t) { // !!
    openok=false;
}
If a PdfReader is constructed in other code locations as well, you may want to harden those, too.
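One way to harden every such location at once is to route all reader construction through a single helper. The following is only a minimal sketch, not iText code: `FragileReader` is a hypothetical stand-in for `PdfReader` whose constructor throws an NPE for a "damaged" input, and `openOrNull` is a made-up helper name. In a real app the helper would call `new PdfReader(path)` instead.

```java
import java.io.IOException;

public class SafeOpenDemo {

    // Hypothetical stand-in for PdfReader: the constructor itself can fail,
    // either with a checked IOException or with an unchecked NPE.
    static class FragileReader {
        final String path;

        FragileReader(String path) throws IOException {
            if (path == null || path.isEmpty()) {
                throw new IOException("no file name given");
            }
            if (path.contains("damaged")) {
                // Simulates an NPE thrown from inside the constructor.
                throw new NullPointerException("simulated parser failure");
            }
            this.path = path;
        }
    }

    // Hypothetical helper: the only place where a reader is constructed.
    // Returns null instead of letting a constructor failure propagate.
    static FragileReader openOrNull(String path) {
        try {
            return new FragileReader(path);
        } catch (IOException e) {
            return null; // file could not be read
        } catch (RuntimeException e) {
            return null; // covers the NPE thrown during construction
        }
    }

    public static void main(String[] args) {
        System.out.println(openOrNull("good.pdf") != null);
        System.out.println(openOrNull("damaged.pdf") != null);
    }
}
```

Callers then only need a null check before they start the page loop, and no code path constructs a reader outside a try block.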
@BrunoLowagie This NPE had better be transformed to a tagged exception, hadn't it?
Upvotes: 3
Reputation: 1201
This is ugly, but if you really want to catch it, try catching RuntimeException.
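This works because NullPointerException is a subclass of RuntimeException. A small sketch of the idea, in which `failedWithRuntimeException` is a hypothetical helper name used only for illustration:

```java
public class CatchRuntimeDemo {

    // Hypothetical helper: runs a task and reports whether a
    // RuntimeException (which includes NullPointerException) was caught.
    static boolean failedWithRuntimeException(Runnable task) {
        try {
            task.run();
            return false;
        } catch (RuntimeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // A task that fails with an NPE, as the reader construction does.
        Runnable failing = () -> {
            throw new NullPointerException("simulated");
        };
        System.out.println(failedWithRuntimeException(failing));
    }
}
```

Unlike `catch (Throwable t)`, this leaves errors such as OutOfMemoryError uncaught. As with any catch clause, it only helps if the try block encloses the statement that actually throws.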
Upvotes: 0