sbouba

Reputation: 337

Tesseract OCR command line for a single character

I'm trying to run tesseract-ocr on this image, without success:

6

> wget -O dOtlrvx.png https://i.sstatic.net/rXR44.png
...
> convert dOtlrvx.png dOtlrvx.tif
> tesseract dOtlrvx.tif out -psm 10 && cat out.txt
Tesseract Open Source OCR Engine v3.02 with Leptonica
Page 0
.

The recognized char is a dot "."

-psm 10 stands for "treat the image as a single character", so I think it's the correct option to use. I also tried the other possible psm values, and none of them work either.

Does anyone have an idea why this is not working? Any suggestion is welcome!

Thanks

Upvotes: 1

Views: 608

Answers (1)

cortex42

Reputation: 234

Create a new config file for tesseract containing the single line tessedit_char_whitelist 0123456789, which restricts recognition to digits. Then pass that file as the last argument by running tesseract dOtlrvx.tif out -psm 10 your_config_file to process your image.

This worked for me.
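
For reference, the steps above can be sketched roughly as follows. The config file name digits_only is just a placeholder I picked, not something tesseract requires. I'm also assuming the file is created in the same directory you run the command from. The OCR step is guarded so the snippet does nothing harmful if tesseract or the image is missing.

```shell
# Placeholder name (an assumption): any file name works, as long as it is
# passed as the last argument to tesseract.
CONFIG_FILE="digits_only"

# A config file holds one variable per line: the name, a space, then the value.
printf 'tessedit_char_whitelist 0123456789\n' > "$CONFIG_FILE"

# Run OCR only if tesseract is installed and the input image exists.
if command -v tesseract >/dev/null 2>&1 && [ -f dOtlrvx.tif ]; then
    # -psm 10 treats the image as a single character (3.x syntax, as in the question).
    tesseract dOtlrvx.tif out -psm 10 "$CONFIG_FILE" && cat out.txt
fi
```

Note that newer tesseract releases spell the option --psm rather than -psm, so adjust the flag if your version rejects it.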

Upvotes: 1
