skt7

Reputation: 1235

Tesseract 4 couldn't load any languages when used with OCR Engine mode - "Legacy + LSTM engines" (--oem 2)

I think this issue is specific to Tesseract 4, which comes with LSTM support. As I am on a 64-bit Windows system, I downloaded the 64-bit Windows executable from https://github.com/UB-Mannheim/tesseract/wiki

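It has the following OCR Engine modes:

0    Legacy engine only.
1    Neural nets LSTM engine only.
2    Legacy + LSTM engines.
3    Default, based on what is available.

It works with all the modes except 2.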


When run with --oem 1

tesseract --oem 1 1.jpg 1

Result:

Tesseract Open Source OCR Engine v4.0.0.20190314 with Leptonica
Warning: Invalid resolution 0 dpi. Using 70 instead.
Estimating resolution as 561
Detected 5 diacritics

and it creates a file 1.txt with the corresponding OCR result.


When run with --oem 2

tesseract --oem 2 1.jpg 1

Result:

Failed loading language 'eng'
Tesseract couldn't load any languages!
Could not initialize tesseract.

and no output is generated.


I thought the problem might be with the language installation, so I ran

tesseract --list-langs

which gave me the following result:

List of available languages (2):
eng
osd

I even checked the tessdata folder manually (screenshot not reproduced here), and it clearly shows that I already have the eng language data.

Can anyone help me pin down the exact problem that prevents me from using the Legacy + LSTM engines (--oem 2) mode?

Upvotes: 4

Views: 12195

Answers (1)

user898678

Reputation: 3328

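Yes, you have the eng language data, but the file you have supports the LSTM engine only. If you want combined Legacy + LSTM support (--oem 2), you need to download the traineddata files from the tessdata repository (https://github.com/tesseract-ocr/tessdata).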
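
As a rough illustration of that fix, here is a minimal Python sketch that fetches eng.traineddata from the tessdata repository into a separate folder and builds the same command as in the question, pointed at that folder with --tessdata-dir. The download URL pattern (the main branch of tesseract-ocr/tessdata), the folder name tessdata_legacy_lstm, and the helper names are assumptions made for this example, not something stated in the question.

```python
import subprocess
import urllib.request
from pathlib import Path

# Assumption: the combined Legacy + LSTM models are served from the main
# branch of the tesseract-ocr/tessdata repository under this path.
TESSDATA_RAW = "https://github.com/tesseract-ocr/tessdata/raw/main"


def traineddata_url(lang: str) -> str:
    """Build the download URL for one language model (hypothetical helper)."""
    return f"{TESSDATA_RAW}/{lang}.traineddata"


def fetch_traineddata(lang: str, tessdata_dir: Path) -> Path:
    """Download <lang>.traineddata into tessdata_dir, overwriting any old copy."""
    tessdata_dir.mkdir(parents=True, exist_ok=True)
    target = tessdata_dir / f"{lang}.traineddata"
    urllib.request.urlretrieve(traineddata_url(lang), str(target))
    return target


def tesseract_command(image: str, out_base: str, oem: int, tessdata_dir: Path) -> list:
    """Assemble the call from the question, plus an explicit --tessdata-dir."""
    return [
        "tesseract",
        "--tessdata-dir", str(tessdata_dir),
        "--oem", str(oem),
        image,
        out_base,
    ]


def main() -> None:
    # Hypothetical local folder; it keeps the installed tessdata untouched.
    data_dir = Path("tessdata_legacy_lstm")
    fetch_traineddata("eng", data_dir)
    # Needs the tesseract executable on PATH and 1.jpg in the working directory.
    subprocess.run(tesseract_command("1.jpg", "1", 2, data_dir), check=True)
```

Calling main() performs the download and then runs tesseract with --oem 2. Alternatively, the downloaded eng.traineddata can be copied over the one in the installed tessdata folder, in which case the original command from the question should work unchanged.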

Upvotes: 12
