Reputation: 886
I have used Tesseract 3.04 with Python and pytesseract (from PyPI). Now I want to use the new LSTM-based 4.00.00alpha.
I'm using Kali Linux, so I installed libtesseract4 (using apt-get). It created a folder named 4.00 in tesseract-ocr, but when I try to use it with pytesseract it does not recognize the --eom input.
The code is:
pytesseract.image_to_string(Image.open(filename),lang="en",config='--eom 2')
Result:
read_params_file: Can't open 1
The oem option also does not appear when I run the tesseract -h command.
It also does not recognize the training data files in the folder tesseract-ocr/4.00/tessdata; it only recognizes training data in the folder tesseract-ocr/tessdata.
If there is a problem with pytesseract, could you please tell me how to set up a Python wrapper for Tesseract 4?
Thanks
Upvotes: 2
Views: 6823
Reputation: 8626
You may try the code below. It works for Tesseract 4.0.0a with Python 3.6.
import pytesseract
from PIL import Image
ocr = pytesseract.image_to_string(Image.open(filename), lang="eng",
                                  boxes=False, config="--psm 3 --oem 2")
--psm 3 is the default Page Segmentation Mode, and --oem 2 selects the combined legacy + LSTM engine.
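If the call still fails, it may be worth checking which Tesseract binary pytesseract actually runs. The error in the question looks consistent with a 3.x binary still being the one on PATH, but that is only a guess. Below is a minimal sketch: the helper names (major_version, build_config, installed_major_version, ocr) are made up for illustration and are not part of pytesseract. It only adds --oem when the binary reports version 4 or later, and can optionally point Tesseract at another tessdata folder with --tessdata-dir.

```python
import re
import shutil
import subprocess


def major_version(version_text):
    """Extract the major version from `tesseract --version` output, or None."""
    match = re.search(r"tesseract\s+v?(\d+)", version_text)
    return int(match.group(1)) if match else None


def build_config(psm=3, oem=None, tessdata_dir=None):
    """Assemble a string for the `config` argument of pytesseract."""
    parts = ["--psm {}".format(psm)]
    if oem is not None:
        parts.append("--oem {}".format(oem))
    if tessdata_dir is not None:
        parts.append('--tessdata-dir "{}"'.format(tessdata_dir))
    return " ".join(parts)


def installed_major_version(cmd="tesseract"):
    """Ask the binary found on PATH for its version (None if not found)."""
    if shutil.which(cmd) is None:
        return None
    # Some older builds print the version to stderr, so merge both streams.
    result = subprocess.run(
        [cmd, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return major_version(result.stdout)


def ocr(path, tessdata_dir=None):
    """Run OCR, passing --oem only when a 4.x (or later) binary is present."""
    # Imported here so the helpers above work without pytesseract installed.
    import pytesseract
    from PIL import Image

    version = installed_major_version()
    if version is not None and version >= 4:
        config = build_config(psm=3, oem=2, tessdata_dir=tessdata_dir)
    else:
        # Older binaries do not accept these flags; fall back to defaults.
        config = ""
    return pytesseract.image_to_string(Image.open(path), lang="eng", config=config)
```

For example, ocr("page.png", tessdata_dir="tesseract-ocr/4.00/tessdata") would use the 4.00 training data from the question, assuming that path is correct on your system. If installed_major_version() returns 3, the 4.00 package did not replace the binary that pytesseract calls.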
Hope this helps.
Upvotes: 0