Reputation: 61
I'm trying to use the image_to_string function from pytesseract but can't get it to work. I've already installed the pytesseract module and the tesseract module, but the latter doesn't seem to work. I have the following code:
import argparse
import cv2
import os
import time
import sys
from PIL import Image
import pytesseract
A=Image.open("C:/Users/Martin/Python/Python36/Tickets/2.jpg")
pytesseract.image_to_string(A)
When I run this I get the following error message:
Traceback (most recent call last):
File "C:/Users/Martin/Python/Python36/cosa.py", line 9, in <module>
pytesseract.image_to_string(A)
File "C:\Users\Martin\Python\Python36\lib\site-packages\pytesseract\pytesseract.py", line 193, in image_to_string
return run_and_get_output(image, 'txt', lang, config, nice)
File "C:\Users\Martin\Python\Python36\lib\site-packages\pytesseract\pytesseract.py", line 140, in run_and_get_output
run_tesseract(**kwargs)
File "C:\Users\Martin\Python\Python36\lib\site-packages\pytesseract\pytesseract.py", line 111, in run_tesseract
proc = subprocess.Popen(command, stderr=subprocess.PIPE)
File "C:\Users\Martin\Python\Python36\lib\subprocess.py", line 709, in __init__
restore_signals, start_new_session)
File "C:\Users\Martin\Python\Python36\lib\subprocess.py", line 997, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] El sistema no puede encontrar el archivo especificado
(The error message is in Spanish; it means "The system cannot find the file specified.") So I tried running import tesseract, and this shows up:
Traceback (most recent call last):
File "<pyshell#53>", line 1, in <module>
import tesseract
File "C:\Users\Martin\Python\Python36\lib\site-packages\tesseract\__init__.py", line 34
print 'Creating user config file: {}'.format(_config_file_usr)
^
SyntaxError: invalid syntax
I guess it's a compatibility problem (I'm using Python 3.6.5, where print is a function, so parentheses are expected), but when I run pip install --upgrade tesseract it says the package is already up to date, so I don't know how to make this work. I'm working on 64-bit Windows 7. Any help is greatly appreciated.
Upvotes: 0
Views: 4835
Reputation: 1
The OCR engine needs to be installed separately from the Python package you get from pip. On Debian/Ubuntu:
sudo apt install tesseract-ocr
Upvotes: 0
Reputation: 9441
I'm not entirely sure this solves your problem, since you're on Windows and the error message isn't in English, but for others arriving from a search: if you encounter
pytesseract.pytesseract.TesseractNotFoundError: tesseract is not installed or it's not in your path
the OCR engine needs to be installed separately from the Python package you get from pip:
sudo apt install tesseract-ocr
This will install it and put it on your PATH.
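If you want to confirm from Python that the install worked, here is a minimal sketch. The helper name tesseract_version is my own, not part of pytesseract; it simply tries to launch the tesseract command, which fails with the same FileNotFoundError seen in the question when the executable is missing:

```python
import subprocess


def tesseract_version(command="tesseract"):
    """Return the first line of `<command> --version`, or None if it cannot be launched.

    Hypothetical helper for illustration; not part of pytesseract.
    """
    try:
        result = subprocess.run(
            [command, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,  # decode output as text (works on Python 3.6)
        )
    except FileNotFoundError:
        # Same root cause as the traceback in the question: no such executable.
        return None
    # Read both streams, since the version banner may arrive on either one.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else ""


version = tesseract_version()
if version is None:
    print("tesseract was not found on PATH")
else:
    print("found:", version)
```

If this reports that tesseract was not found, pytesseract will not be able to launch it either.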
Upvotes: 0
Reputation: 804
Tesseract is not installed on your system.
The tesseract package that you installed with pip is a different Python package, unrelated to the Tesseract OCR engine.
You have to install Tesseract by following these instructions. Then you can use pytesseract.
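Once the engine is installed, pytesseract still has to be able to find the executable. A rough sketch of one way to handle that, assuming you either have tesseract on your PATH or know where the installer put it (find_tesseract and ocr_image are names I made up; the Windows path in the usage line is only a commonly used location, so check your own install):

```python
import os
import shutil


def find_tesseract(executable="tesseract", extra_candidates=()):
    """Return a path to the Tesseract binary, or None if it cannot be found."""
    # First search PATH, which is where subprocess would look for the command.
    found = shutil.which(executable)
    if found:
        return found
    # Fall back to explicit locations supplied by the caller.
    for candidate in extra_candidates:
        if os.path.isfile(candidate):
            return candidate
    return None


def ocr_image(image_path, extra_candidates=()):
    """Run OCR on image_path, failing early with a clear message."""
    # Imported lazily so find_tesseract works without these packages installed.
    from PIL import Image
    import pytesseract

    binary = find_tesseract(extra_candidates=extra_candidates)
    if binary is None:
        raise RuntimeError("Tesseract OCR engine not found; install it first.")
    # pytesseract launches this command through subprocess.
    pytesseract.pytesseract.tesseract_cmd = binary
    return pytesseract.image_to_string(Image.open(image_path))
```

Usage would look something like print(ocr_image("C:/Users/Martin/Python/Python36/Tickets/2.jpg", [r"C:\Program Files\Tesseract-OCR\tesseract.exe"])), where the second argument is an assumed install location. Note that image_to_string returns the text, so you need to print or store the result.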
Upvotes: 3