tesseract OCR - Q detected as O

Question

I am developing an application to read an identification badge using OpenCV and tesseract as the OCR engine. I wrote an algorithm using OpenCV which handles the text detection in order to get a clear and "easy-to-read" image for my OCR engine. I have added an image below to illustrate what I get:

When I ask tesseract to "read" the image, I get "KO 978", with the Q read as an O. Searching for this "O/Q problem" with tesseract, I found only this post: https://groups.google.com/forum/#!topic/tesseract-issues/kEDIIpQ-9W4. There, however, the problem seems to be that the input image for tesseract was not preprocessed properly (the response was that the image had not been deskewed).

Based on the wiki section on GitHub, I followed all the steps of the Improve Quality page (and I think that the image is clear enough), so I do not know what else I can do. I do not know if training the OCR will help, but if possible I want to avoid it because of the hard work involved and because it is not recommended in the documentation.

I am using tesseract v3.03 from the console, not integrated into my app (so tesseract applies its own preprocessing to the input image).
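A minimal sketch of that kind of console call, wrapped in Python so it is easy to reproduce. The file names `badge.png` and `out`, the helper names `build_command` and `run_ocr`, and the optional page segmentation mode are assumptions for illustration, not my exact setup:

```python
import shutil
import subprocess


def build_command(image_path, output_base, lang="eng", psm=None):
    """Build the argument list for a Tesseract 3.x console call (hypothetical helper)."""
    cmd = ["tesseract", image_path, output_base, "-l", lang]
    if psm is not None:
        # Tesseract 3.x spells the page segmentation flag "-psm".
        cmd += ["-psm", str(psm)]
    return cmd


def run_ocr(image_path, output_base, **kwargs):
    """Run Tesseract and return the recognized text, or None if it is not installed."""
    if shutil.which("tesseract") is None:
        return None
    subprocess.run(build_command(image_path, output_base, **kwargs), check=True)
    # Tesseract writes its result to "<output_base>.txt".
    with open(output_base + ".txt", encoding="utf-8") as fh:
        return fh.read().strip()


if __name__ == "__main__":
    # "badge.png" and "out" are placeholder names.
    print(build_command("badge.png", "out", psm=7))
```

Passing `psm=7` asks tesseract to treat the image as a single text line, which may or may not suit a cropped badge field.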

Any ideas on how to solve this? Thanks!
