Reputation: 363
I am trying to develop an OCR app for mobile.
Before passing the image to the OCR engine, I apply some filters and binarize it for better results.
I am using an adaptive Gaussian threshold, which gives fairly good results but leaves dots and noise around the text (as you can see in the image below), and this leads to errors in the OCR output.
This is just a small segment of a larger image.
As far as I understand, the cause is a white outline around the text, which is only visible when I zoom into the image significantly:
I try to minimize it by applying Gaussian blurring before binarizing. Still, I believe I could get better results if I understood and eliminated the cause of that white outline around the text.
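For reference, a minimal sketch of the pipeline described above (Gaussian blur followed by an adaptive Gaussian threshold), written in plain Python on nested lists so the arithmetic is visible. The function names, the sigma values, the 5x5 block size and the constant `c` are assumptions chosen for illustration, not the settings actually used in the app. In OpenCV the corresponding calls would be `cv2.GaussianBlur` and `cv2.adaptiveThreshold` with `cv2.ADAPTIVE_THRESH_GAUSSIAN_C`.

```python
import math

def gaussian_kernel(size, sigma):
    """Return a normalized size x size Gaussian kernel as nested lists."""
    half = size // 2
    raw = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            for x in range(-half, half + 1)]
           for y in range(-half, half + 1)]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]

def weighted_mean(img, kernel):
    """Kernel-weighted neighbourhood mean; border pixels are replicated."""
    h, w = len(img), len(img[0])
    half = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky, krow in enumerate(kernel):
                for kx, kval in enumerate(krow):
                    yy = min(max(y + ky - half, 0), h - 1)
                    xx = min(max(x + kx - half, 0), w - 1)
                    acc += kval * img[yy][xx]
            out[y][x] = acc
    return out

def adaptive_gaussian_threshold(img, block_size=5, c=10, pre_blur=True):
    """Pixel becomes 255 if it exceeds (local Gaussian mean - c), else 0."""
    # Optional smoothing step before binarizing (3x3, sigma 1.0 is an assumption).
    src = weighted_mean(img, gaussian_kernel(3, 1.0)) if pre_blur else img
    # Local threshold surface; sigma = block_size / 3 is a hypothetical choice.
    local = weighted_mean(src, gaussian_kernel(block_size, block_size / 3.0))
    return [[255 if src[y][x] > local[y][x] - c else 0
             for x in range(len(src[0]))]
            for y in range(len(src))]

# Light page (200) with a one-pixel-wide dark stroke (20) in column 3.
page = [[20 if x == 3 else 200 for x in range(7)] for _ in range(7)]
binary = adaptive_gaussian_threshold(page)
```

Because the threshold is computed per neighbourhood, any small local dip in brightness near a stroke edge can flip pixels, which is one plausible source of the stray dots; raising `c` or the block size trades that noise against thin strokes.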
I am also adding the images; the file name of each image describes what it shows.
I am getting good results, but I would like more insight into the problem and want to explore whether there is another or better way of achieving the same thing.
Any guidance or direction would be of great help. I hope my question is clear. Feel free to ask for any details.
Thank you.
Upvotes: 3
Views: 719
Reputation: 625
Have you tried morphological operations? They should reduce the white shade, provided you choose a suitable structuring element size and shape (for example, a circular disc).
It would be more useful if you could mention the sequence of operations you perform on your image, so we can see at what stage the white shade appears.
I think dilation will help here. MATLAB accepts a grayscale image for dilation and does a wonderful job. Try it with OpenCV; I have done this before.
Which binary thresholding technique are you using?
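To make the suggestion concrete, here is a minimal sketch of grayscale dilation with a disc-shaped structuring element, in plain Python on nested lists. The function name `dilate_gray` and the default radius are assumptions for illustration; in OpenCV the equivalent would be `cv2.dilate` with a kernel from `cv2.getStructuringElement(cv2.MORPH_ELLIPSE, ...)`.

```python
def dilate_gray(img, radius=1):
    """Grayscale dilation: each pixel becomes the maximum over a disc window."""
    h, w = len(img), len(img[0])
    # Offsets inside a disc of the given radius (radius 1 gives a plus shape).
    offsets = [(dy, dx)
               for dy in range(-radius, radius + 1)
               for dx in range(-radius, radius + 1)
               if dx * dx + dy * dy <= radius * radius]
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Only consider neighbours that fall inside the image.
            out[y][x] = max(img[y + dy][x + dx]
                            for dy, dx in offsets
                            if 0 <= y + dy < h and 0 <= x + dx < w)
    return out

# White page (255) with one isolated dark speck in the middle.
speck = [[0 if (y, x) == (2, 2) else 255 for x in range(5)] for y in range(5)]
cleaned = dilate_gray(speck)
```

Note that for dark text on a light background, dilation grows the bright regions: isolated dark specks disappear, but strokes also get thinner, so the radius should stay small relative to the stroke width.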
Upvotes: 0
Reputation: 1270
Since you are going to implement this for mobile, what about just converting it to a binary image? (I only used MATLAB here to show the idea.)
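img = imread('OGGjn.png');   % load the sample image from the question
imgb = im2bw(img);           % global threshold at the default level (0.5)
imshow(imgb);                % display the binary result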
Output:
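For anyone without MATLAB, the following is a rough plain-Python sketch of what the call above does: convert RGB to luminance, then apply a single global threshold. The function name `im2bw_like` is hypothetical, and the luminance weights are the commonly quoted rounded coefficients, so results may differ from MATLAB in the last decimal places.

```python
def im2bw_like(rgb, level=0.5):
    """Global threshold: 1 where normalized luminance exceeds level, else 0."""
    out = []
    for row in rgb:
        out_row = []
        for r, g, b in row:
            # Approximate luminance from 8-bit RGB, scaled to the range 0..1.
            lum = (0.2989 * r + 0.5870 * g + 0.1140 * b) / 255.0
            out_row.append(1 if lum > level else 0)
        out.append(out_row)
    return out

# One row of pixels: white, black, light gray, dark gray.
row = [[(255, 255, 255), (0, 0, 0), (200, 200, 200), (100, 100, 100)]]
mask = im2bw_like(row)
```

A single global level is cheap, which suits a mobile device, but unlike the adaptive threshold it does not compensate for uneven lighting across the page.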
Upvotes: 2