Recognize coloured text with Tesseract (Tess4J)

Question

I'm working on text recognition with Java and Tess4J. I'm recognizing black and red images (separately), both on a white background and very clear. With the black ones it works perfectly, but with the red ones Tesseract just goes crazy. I tried setting the variable ("editor_image_text_color", "RED"), but it does not help at all. Right now, what I do for the red ones is scan the whole image and set every red pixel to black, which I find very inefficient: it takes a few calculations per pixel, because the pictures have different intensities of red that I have to preserve. Thanks a lot!

For instance, http://imageshack.us/photo/my-images/593/3eu9.png/ always gives me a 9, but http://imageshack.us/photo/my-images/818/efxf.png/ does not. It is as if the number were being lost in the preprocessing: the black ones work extremely well, but the results for the red ones aren't any better than a random number.

nguyenq · Accepted Answer

Try converting the colored image to grayscale using the ImageHelper.convertImageToGrayscale(BufferedImage image) method.
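For illustration, here is a minimal sketch of the same idea using only the standard java.awt classes, in case you want to see what a grayscale conversion involves or cannot call the helper directly. The class name GrayscaleSketch and the method toGrayscale are hypothetical names made up for this example, and this is not necessarily how Tess4J implements its helper. It redraws the image into a TYPE_BYTE_GRAY buffer, so the intensity mapping is done by Java 2D rather than by a hand-written per-pixel loop.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical helper for illustration only; not part of Tess4J.
final class GrayscaleSketch {

    // Returns a new 8-bit grayscale copy of src; src itself is left unchanged.
    static BufferedImage toGrayscale(BufferedImage src) {
        BufferedImage gray = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = gray.createGraphics();
        try {
            // Drawing into a gray buffer lets Java 2D convert each colour to a
            // gray level, so lighter and darker reds stay distinguishable.
            g.drawImage(src, 0, 0, null);
        } finally {
            g.dispose();
        }
        return gray;
    }
}
```

The returned BufferedImage can then be passed to Tess4J's doOCR(BufferedImage) in place of the original red image. Red text on white should come out as dark gray on white, which is much closer to the black-on-white case that already works for you.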
