Reputation: 91
I got the following error when I tried to recognize the Chinese text in a picture with Python. (I already had the "chi_sim.traineddata" file in the tessdata directory, and recognizing English sentences in a picture worked, so this error really confused me.)
C:\Users\Lenovo\AppData\Local\Programs\Python\Python37-32\python.exe E:/PKU1.3/python_math/set_for_recognition.py
Traceback (most recent call last):
  File "E:/PKU1.3/python_math/set_for_recognition.py", line 5, in <module>
    text=pytesseract.image_to_string(Image.open('climb_high.jpeg'),lang='chi_sim')
  File "C:\Users\Lenovo\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pytesseract\pytesseract.py", line 295, in image_to_string
    return run_and_get_output(*args)
  File "C:\Users\Lenovo\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pytesseract\pytesseract.py", line 203, in run_and_get_output
    run_tesseract(**kwargs)
  File "C:\Users\Lenovo\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pytesseract\pytesseract.py", line 179, in run_tesseract
    raise TesseractError(status_code, get_errors(error_string))
pytesseract.pytesseract.TesseractError: (3221225477, '')
Upvotes: 3
Views: 7960
Reputation: 11
Please try the code below:
import pytesseract
from PIL import Image
pytesseract.pytesseract.tesseract_cmd = r'C:/Program Files/Tesseract-OCR/tesseract.exe'
tessdata_dir_config = '--tessdata-dir "C:/Program Files/Tesseract-OCR/tessdata"'
img = Image.open(r'images\Capture2.JPG')
text = pytesseract.image_to_string(img, config=tessdata_dir_config)
print(text)
Upvotes: 1
Reputation: 7948
I think the TRAINEDDATA file is what causes this problem.
I used to develop an OCR project with Tesseract on Windows 7.
After I moved to Windows 10, I got this error as well, and I found that it is related to the TRAINEDDATA file: if I use the TRAINEDDATA that I trained on Windows 7, it works fine without any error message.
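Before swapping TRAINEDDATA files, it can help to see exactly which ones Tesseract will find. The following is only a minimal sketch: `list_traineddata` is a hypothetical helper name, and the tessdata path is the one used in another answer in this thread, so adjust it to your installation. It only confirms which files are present and how large they are; it cannot tell whether a file is compatible with your Tesseract version.

```python
import os


def list_traineddata(tessdata_dir):
    """Map each language code found in tessdata_dir to its file size in bytes."""
    suffix = ".traineddata"
    found = {}
    for name in sorted(os.listdir(tessdata_dir)):
        if name.endswith(suffix):
            path = os.path.join(tessdata_dir, name)
            # Strip the suffix so "chi_sim.traineddata" is reported as "chi_sim".
            found[name[:-len(suffix)]] = os.path.getsize(path)
    return found


if __name__ == "__main__":
    # Assumed location; change it to wherever your tessdata directory lives.
    tessdata = r"C:/Program Files/Tesseract-OCR/tessdata"
    if os.path.isdir(tessdata):
        for lang, size in list_traineddata(tessdata).items():
            print(lang, size)
    else:
        print("tessdata directory not found:", tessdata)
```

A zero-byte or unexpectedly small chi_sim entry would suggest a broken download rather than a version mismatch.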
Upvotes: 1
Reputation: 1271
I got this error because my UZN file extended beyond the image area. To find this out, I patched pytesseract.py by adding print(' '.join(cmd_args)) in run_tesseract() so that I could see the tesseract command being run, which was throwing an assertion error.
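If you would rather not edit pytesseract.py, a rough alternative is to run Tesseract yourself and read what it writes to stderr. This is only a sketch under assumptions: `build_tesseract_cmd` and `run_and_capture` are hypothetical helper names, the output base name is a placeholder, and it assumes `tesseract` is on your PATH and accepts the usual `tesseract image outputbase -l lang` form.

```python
import shutil
import subprocess


def build_tesseract_cmd(image_path, output_base, lang=None, tesseract_cmd="tesseract"):
    """Build the argument list for a plain command-line Tesseract run."""
    cmd = [tesseract_cmd, image_path, output_base]
    if lang:
        cmd += ["-l", lang]
    return cmd


def run_and_capture(cmd):
    """Run the command and return (exit status, decoded stderr)."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return proc.returncode, proc.stderr.decode(errors="replace")


if __name__ == "__main__":
    # 'climb_high.jpeg' is the image from the question; 'out' is a placeholder.
    cmd = build_tesseract_cmd("climb_high.jpeg", "out", lang="chi_sim")
    print(" ".join(cmd))
    if shutil.which(cmd[0]):
        status, errors = run_and_capture(cmd)
        print("exit status:", status)
        print(errors)
    else:
        print("tesseract was not found on PATH")
```

Any assertion message Tesseract prints before exiting should then show up in the captured stderr.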
Upvotes: 0
Reputation: 1
The error code 3221225477 is 0xC0000005 (ACCESS_VIOLATION), which means that Tesseract itself has crashed, so switching to a different version of Tesseract may help.
For me the problem occurs in 4.00 (beta) and 3.02, while 3.05 works fine (I use Windows 7).
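As a quick check of the conversion above, here is a minimal sketch that turns the decimal status from the TesseractError into hex. The variable names are only illustrative.

```python
status_code = 3221225477  # the number reported in the TesseractError

# Windows status codes are usually written as 8-digit hexadecimal values.
as_hex = "0x{:08X}".format(status_code)
print(as_hex)

# Some tools report the same 32-bit value as a signed integer instead.
signed = status_code - (1 << 32) if status_code >= (1 << 31) else status_code
print(signed)
```

This prints 0xC0000005 and -1073741819, the two forms in which this crash code commonly appears.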
Hope this helps.
Upvotes: 0