Reputation: 161
The images I have give me inconsistent results. My thinking is this: my text is always in a white font, so if I can turn the text pixels black and turn everything else white or transparent, I should have better success.
My question is: what library or language is best for this? Do I have to turn my white pixels into some unique RGB value, turn everything else white or transparent, and then find the unique RGB value and make it black? Any help is appreciated.
Upvotes: 0
Views: 578
Reputation: 111
Yes, if you could make the text pixels black and the rest of the document white, you would have better success. This is not always possible, but there are processing steps that can help.
The median filter (and other low-pass filters) can be used to remove noise present in the image.
Erosion can also help remove things that are not characters, such as thin lines and noise.
Aligning the text is also a good idea, since OCR accuracy can drop considerably when the text is skewed. To do this, you could try the Hough transform followed by a rotation: use the Hough transform to find a line in your text, then rotate the image by the angle of that line so the line becomes horizontal.
All of the processing steps mentioned can be done with OpenCV or scikit-image.
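As for the white-to-black conversion asked about in the question, there should be no need for an intermediate "unique RGB" colour: you can build a mask of the near-white pixels and write the output in one step. Here is a minimal sketch using plain NumPy, assuming the image is already loaded as an H x W x 3 array; `white_text_to_black` and the `min_white` cutoff are hypothetical names and values for illustration.

```python
import numpy as np


def white_text_to_black(rgb, min_white=240):
    """Return a single-channel image: near-white pixels become 0, all others 255.

    `min_white` is an assumed tolerance; a pixel counts as text only when
    all three colour channels are at or above it.
    """
    rgb = np.asarray(rgb)
    is_text = np.all(rgb[..., :3] >= min_white, axis=-1)
    return np.where(is_text, 0, 255).astype(np.uint8)


# Tiny 2 x 2 example: two near-white pixels, one dark pixel, one pure red pixel.
sample = np.array([[[255, 255, 255], [10, 20, 30]],
                   [[250, 250, 250], [255, 0, 0]]], dtype=np.uint8)
result = white_text_to_black(sample)
```

This only works well when nothing else in the image is as bright as the text. If the background also contains near-white regions, colour alone will not separate them.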
It is also worth pointing out that there are many other ways to preprocess text images, too many to mention here.
Upvotes: 1