Teja

Reputation: 967

Python OCR Tesseract cannot recognize Single Characters

I have two TIF images. The first image (a.tif) is:

Single Character Image

and the second image (bcd.tif) is:

Multiple Character Image

When I run "tesseract a.tif a.txt", it does not read the character, but the same command on the second image, "tesseract bcd.tif bcd.txt", works. I have seen some answers on Stack Overflow, but they didn't explain how to actually run it. If any parameters need to be added, what are they?

Upvotes: 2

Views: 3763

Answers (2)

hazcoper

Reputation: 41

As you said, you need to change the mode to single-character mode. You can do that in Python with the following call:

import pytesseract
text = pytesseract.image_to_string(img_path, config="--psm 10")

Upvotes: 4

Nisarg Shah

Reputation: 14531

Seems like the issue has something to do with there being only a single character in the image. For instance I tried these two images:

This one works fine. Tesseract reports 95% confidence in the result:

[image]

This one doesn't work.

[image]

I also tried scanning that image with PageSegMode set to SingleChar, and then it is recognized correctly.

The command-line argument for that is -psm 10 (spelled --psm 10 in Tesseract 4 and later). See this: https://stackoverflow.com/a/26418458/5894241
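For anyone who wants to drive the command-line tool from Python instead of pytesseract, here is a minimal sketch of how that call could be assembled. The helper names tesseract_command and read_single_char, and the legacy_flag parameter, are made up for illustration; the sketch assumes the tesseract binary is on your PATH.

```python
import subprocess


def tesseract_command(image_path, output_base, psm=10, legacy_flag=False):
    """Build the argument list for a tesseract CLI call (hypothetical helper)."""
    # Tesseract 3.x spells the option "-psm"; Tesseract 4 and later use "--psm".
    flag = "-psm" if legacy_flag else "--psm"
    # Page segmentation mode 10 treats the image as a single character.
    return ["tesseract", image_path, output_base, flag, str(psm)]


def read_single_char(image_path, output_base, legacy_flag=False):
    """Run tesseract in single-character mode and return the recognized text."""
    cmd = tesseract_command(image_path, output_base, legacy_flag=legacy_flag)
    subprocess.run(cmd, check=True)
    # tesseract appends ".txt" to the output base itself, so read that file.
    with open(output_base + ".txt", encoding="utf-8") as fh:
        return fh.read().strip()
```

Calling read_single_char("a.tif", "a") would run the equivalent of "tesseract a.tif a --psm 10" and read the result from a.txt. Note that the output argument is a base name: passing "a.txt", as in the question, makes tesseract write to a.txt.txt.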

Upvotes: 1
