timguy

Reputation: 2612

Tess4j - Pdf to Tiff to tesseract - "Warning: Invalid resolution 0 dpi. Using 70 instead."

I am using tess4j (net.sourceforge.tess4j:tess4j:4.4.0) to run OCR on PDF files. As I understand it, I first have to convert the PDF to TIFF or PNG (is one of them preferred?), which I did like this:

tesseract.doOCR(PdfUtilities.convertPdf2Tiff(inputPdfFile)); 

and I get the following warning:

Warning: Invalid resolution 0 dpi. Using 70 instead.

Question

Upvotes: 3

Views: 7521

Answers (3)

Vlad-Florin Ciocan

Reputation: 1

The default resolution is not set.

To complement nguyenq's answer:

instance.setVariable("user_defined_dpi", "300");

Upvotes: 0

nguyenq

Reputation: 8365

If no resolution information is in image metadata, Tesseract tries to estimate the resolution by itself so that font size information can be calculated in results.
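To see why the resolution matters for font size: a point is 1/72 inch, so a glyph's size in points is its pixel height times 72 divided by the DPI. Here is a small sketch of that arithmetic. The class and method names are made up for illustration and are not part of Tesseract or tess4j.

```java
/** Illustrative only: converts a glyph height in pixels to typographic points. */
final class FontSizeEstimate {
    private FontSizeEstimate() {}

    /** One point is 1/72 inch, so points = pixels * 72 / dpi. */
    static double pixelsToPoints(int heightPx, int dpi) {
        if (dpi <= 0) {
            throw new IllegalArgumentException("dpi must be positive");
        }
        return heightPx * 72.0 / dpi;
    }

    public static void main(String[] args) {
        // The same 40-pixel glyph maps to very different point sizes
        // depending on which resolution is assumed.
        System.out.println(pixelsToPoints(40, 300));
        System.out.println(pixelsToPoints(40, 70));
    }
}
```

A 40-pixel glyph is about 9.6 pt at 300 dpi but roughly 41 pt at 70 dpi, so a wrong resolution guess can throw the reported font sizes off by a large factor.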

You can try the following APIs to set input image resolution:

instance.setVariable("user_defined_dpi", "300");

or

TessBaseAPISetSourceResolution(TessBaseAPI handle, int ppi);

You can suppress console output by:

instance.setVariable("debug_file", "/dev/null");

Upvotes: 7

David James

Reputation: 11

In version 5.4.0 of tess4j, use

instance.setVariable("user_defined_dpi", "300");

instead of

instance.setTessVariable("user_defined_dpi", "300");
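Since the setter name depends on the tess4j version, one option is to keep the variable names in one place and pass in whichever setter your version exposes as a method reference. This is a minimal sketch using only the JDK. `OcrVariables` and `apply` are hypothetical names, not part of tess4j; the only things taken from the answers here are the variable names `user_defined_dpi` and `debug_file`.

```java
import java.util.function.BiConsumer;

/** Illustrative helper (not part of tess4j): applies the two variables discussed in this thread. */
final class OcrVariables {
    private OcrVariables() {}

    /**
     * @param setter    the variable setter of your Tesseract instance, e.g. a method reference
     * @param dpi       resolution to assume for images that carry no DPI metadata
     * @param debugFile where Tesseract should write its messages, or null to leave that unchanged
     */
    static void apply(BiConsumer<String, String> setter, int dpi, String debugFile) {
        if (dpi <= 0) {
            throw new IllegalArgumentException("dpi must be positive");
        }
        setter.accept("user_defined_dpi", Integer.toString(dpi));
        if (debugFile != null) {
            setter.accept("debug_file", debugFile);
        }
    }

    public static void main(String[] args) {
        // Stand-in setter that only prints the pairs. With tess4j you would
        // pass the instance's own setter instead (assumed usage):
        //   OcrVariables.apply(instance::setVariable, 300, "/dev/null");
        apply((key, value) -> System.out.println(key + "=" + value), 300, "/dev/null");
    }
}
```

The call would go before `doOCR`, so the variables are in place when the image is processed. Note that `/dev/null` is a Unix path, so on other platforms the debug file target would need to be adjusted.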

Upvotes: 0
