Reputation: 1794
I'm doing text recognition in Java with Tess4J. I'm recognizing black images and red images (separately), both on a white background and very clear. With the black ones it works perfectly, but with the red ones Tesseract just goes crazy. I tried setting the variable ("editor_image_text_color", "RED"), but it does not help at all. Right now, for the red ones, I scan the whole image and set every red pixel to black, which I find very inefficient: it takes a few calculations per pixel, because the pictures have different intensities of red that I have to preserve. Thanks a lot!
For instance: http://imageshack.us/photo/my-images/593/3eu9.png/ always gives me a 9, but http://imageshack.us/photo/my-images/818/efxf.png/ does not. It is as if the number were being lost in the preprocessing: the black ones work extremely well, but the results for the red ones aren't any better than a random number.
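For reference, the per-pixel pass described above could look roughly like the sketch below. It is only an illustration, not the code actually used: the class name `RedTextToGray` and the choice of taking the weaker of the green and blue channels as the gray level are assumptions. The idea is that on a white background a strong red has low green/blue and comes out dark, while a faint red comes out light, so the intensity is preserved.

```java
import java.awt.image.BufferedImage;
import java.awt.image.WritableRaster;

// Hypothetical helper, not part of Tess4J.
final class RedTextToGray {

    /**
     * Maps red-on-white pixels to gray levels.
     * White (255,255,255) stays 255, pure red (255,0,0) becomes 0,
     * and lighter reds land in between.
     */
    static BufferedImage redToGray(BufferedImage src) {
        int w = src.getWidth();
        int h = src.getHeight();
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_BYTE_GRAY);
        WritableRaster raster = out.getRaster();
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = src.getRGB(x, y);
                int green = (rgb >> 8) & 0xFF;
                int blue = rgb & 0xFF;
                // Assumption: red text has low green and blue, so the
                // weaker of the two serves as the gray level.
                raster.setSample(x, y, 0, Math.min(green, blue));
            }
        }
        return out;
    }
}
```

This still visits every pixel, so it does not remove the cost complained about above; it only keeps the work per pixel to a couple of bit operations.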
Upvotes: 1
Views: 3876
Reputation: 1794
Thanks for the answer, nguyenq. I tried that function and it didn't work very well, but after checking out that ImageHelper class, I used this method:
ImageHelper.convertImageToBinary(BufferedImage image)
and it works quite well, thanks!
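For anyone without Tess4J on the classpath, a plain-JDK stand-in for a binary conversion might look like the sketch below. This is an assumption about the general approach (redrawing the image onto a 1-bit `TYPE_BYTE_BINARY` buffer), not a copy of the library's code, and the class name `BinarySketch` is made up for illustration.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical stand-in, not the Tess4J ImageHelper class.
final class BinarySketch {

    /** Redraws the source image onto a 1-bit black/white buffer. */
    static BufferedImage toBinary(BufferedImage src) {
        BufferedImage binary = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_BYTE_BINARY);
        Graphics2D g = binary.createGraphics();
        try {
            // Java 2D maps each source color to black or white here.
            g.drawImage(src, 0, 0, null);
        } finally {
            g.dispose();
        }
        return binary;
    }
}
```

Unlike a hand-written threshold, this gives no control over where the black/white cut falls, so very faint reds may need a different approach.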
Upvotes: 1
Reputation: 8345
Try converting the colored image to grayscale using the ImageHelper.convertImageToGrayscale(BufferedImage image) method.
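As a rough idea of what a grayscale conversion involves, here is a plain-JDK sketch that redraws the image onto a `TYPE_BYTE_GRAY` buffer. It is an illustration under that assumption rather than the library's implementation, and the class name `GrayscaleSketch` is hypothetical.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical stand-in, not the Tess4J ImageHelper class.
final class GrayscaleSketch {

    /** Redraws the source image onto an 8-bit grayscale buffer. */
    static BufferedImage toGrayscale(BufferedImage src) {
        BufferedImage gray = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = gray.createGraphics();
        try {
            // Java 2D converts each color to a gray level while drawing.
            g.drawImage(src, 0, 0, null);
        } finally {
            g.dispose();
        }
        return gray;
    }
}
```

The resulting image can then be handed to the OCR call in place of the original red one. Red text ends up as a mid-to-dark gray rather than pure black, which may or may not be enough contrast depending on the image.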
Upvotes: 2