Aimedk

Reputation: 21

image_to_osd Tesseract error using Python 3.6

I'm trying to use pytesseract's image_to_osd function, but it raises the error below under Python 3.6. When I run the same script in another environment with Python 3.8, it works. Is there any configuration needed for Python 3.6, or anything else I can do?

angle_rotated_image = re.search(r'(?<=Rotate: )\d+', pytesseract.image_to_osd(rotated)).group(0)

error:

     angle_rotated_image = re.search('(?<=Rotate: )\d+',pytesseract.image_to_osd(rotated)).group(0)
  File "C:\Users\username\AppData\Roaming\Python\Python36\site-packages\pytesseract\pytesseract.py", line 543, in image_to_osd
    }[output_type]()
  File "C:\Users\username\AppData\Roaming\Python\Python36\site-packages\pytesseract\pytesseract.py", line 542, in <lambda>
    Output.STRING: lambda: run_and_get_output(*args),
  File "C:\Users\username\AppData\Roaming\Python\Python36\site-packages\pytesseract\pytesseract.py", line 287, in run_and_get_output
    run_tesseract(**kwargs)
  File "C:\Users\username\AppData\Roaming\Python\Python36\site-packages\pytesseract\pytesseract.py", line 263, in run_tesseract
    raise TesseractError(proc.returncode, get_errors(error_string))
pytesseract.pytesseract.TesseractError: (1, 'Tesseract Open Source OCR Engine v5.0.0.20190623 with Leptonica Warning: Invalid resolution 0 dpi. Using 70 instead. Estimating resolution as 163 Warning. Invalid resolution 0 dpi. Using 70 instead. Too few characters. Skipping this page Error during processing.')

Upvotes: 2

Views: 1052

Answers (1)

SocraticDatum

Reputation: 379

I ran into a similar problem when using pytesseract's image_to_osd() to determine the rotation of a document. It worked fine for me on macOS with tesseract 4.1.1, but it failed on Windows with tesseract 5.0.0-alpha. I read through many threads about the OP's error and tried various things, such as passing --dpi and -c min_characters_to_try=, with no success. What finally solved my problem was switching to a different version of tesseract on Windows.
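For reference, options like these go through the config argument of image_to_osd(). The snippet below is only a sketch of that attempt: the helper names build_osd_config and osd_with_options are made up for illustration, and the values 300 and 5 are placeholder assumptions, not recommendations. As noted above, this did not fix the failure for me on 5.0.0-alpha.

```python
def build_osd_config(dpi=300, min_chars=5):
    """Build a config string for image_to_osd (default values are placeholders)."""
    return f"--dpi {dpi} -c min_characters_to_try={min_chars}"


def osd_with_options(image, dpi=300, min_chars=5):
    """Run OSD with the extra options; needs pytesseract and tesseract installed."""
    import pytesseract  # imported here so build_osd_config works without it

    return pytesseract.image_to_osd(image, config=build_osd_config(dpi, min_chars))
```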

Status of image_to_osd():

  • (PASS) OS macOS; tesseract 4.1.1; pytesseract 0.3.0; Python 3.6.5
  • (PASS) OS Windows; tesseract 4.1.0; pytesseract 0.3.0; Python 3.6.5
  • (FAIL) OS Windows; tesseract 5.0.0-alpha; pytesseract 0.3.0; Python 3.6.5

I think pytesseract 0.3.7 will probably work too, but I didn't test it.

Note that you can still get the OP's error with tesseract 4.1.0, but from what I tested it now only shows up in reasonable cases, e.g., blank pages.
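Since the error can still occur on pages with too little text, it may be worth guarding the call. Here is a minimal sketch, assuming you are happy to treat a failed detection as "no rotation"; the helper names parse_rotation and detect_rotation and the default of 0 are my own choices for illustration, not part of pytesseract.

```python
import re


def parse_rotation(osd_text):
    """Return the 'Rotate:' angle from OSD output as an int, or None if missing."""
    match = re.search(r"(?<=Rotate: )\d+", osd_text)
    return int(match.group(0)) if match else None


def detect_rotation(image, default=0):
    """Return the rotation tesseract suggests, or `default` if OSD fails."""
    import pytesseract  # imported here so parse_rotation works without it

    try:
        osd_text = pytesseract.image_to_osd(image)
    except pytesseract.pytesseract.TesseractError:
        # Raised e.g. for "Too few characters. Skipping this page".
        return default
    angle = parse_rotation(osd_text)
    return default if angle is None else angle
```

With this, the OP's line would become something like angle_rotated_image = detect_rotation(rotated), which returns an int instead of a string.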

Upvotes: 2
