Reputation: 14625
I'm using the Tesseract OCR library to extract text from photos taken of screens. The problem is that most modern cameras also capture the display's individual pixels when taking a photo.
Is there any way to apply a filter or thresholding to the bitmap to "extract" the text into a clearer image, for better results with Tesseract?
See example, before processing:
After processing (threshold effect in Photoshop):
Upvotes: 3
Views: 841
Reputation: 22342
Tesseract has a built-in threshold method, TessBaseAPI#ThresholdRect. Have you tried that? If so, what problems did you have with it?
If it doesn't work well on some pictures, you may want to look up "mean" or "adaptive" threshold algorithms. Tesseract's looks like a straight (global) threshold, so it may not adapt well to darker or lighter images without some tweaking.
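To give an idea of what a "mean" adaptive threshold does, here is a minimal sketch in plain Python (chosen for brevity; the same loops port to an Android pixel array). The function name, the `window` and `offset` parameters, and the list-of-rows image format are all assumptions made for illustration, not part of Tesseract. Each pixel is compared against the average of its own neighbourhood rather than a single global cut-off.

```python
def adaptive_mean_threshold(gray, window=7, offset=10):
    """Binarize a grayscale image given as a list of rows of 0-255 ints.

    A pixel becomes 0 (ink) when it is more than `offset` darker than the
    mean of the window x window neighbourhood centred on it (clipped at the
    image borders); otherwise it becomes 255 (background).
    Hypothetical helper for illustration only.
    """
    height, width = len(gray), len(gray[0])

    # Integral image with an extra zero row/column, so that any rectangle
    # sum can be read with four lookups.
    integral = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += gray[y][x]
            integral[y + 1][x + 1] = integral[y][x + 1] + row_sum

    half = window // 2
    out = [[255] * width for _ in range(height)]
    for y in range(height):
        y0, y1 = max(0, y - half), min(height, y + half + 1)
        for x in range(width):
            x0, x1 = max(0, x - half), min(width, x + half + 1)
            area = (y1 - y0) * (x1 - x0)
            total = (integral[y1][x1] - integral[y0][x1]
                     - integral[y1][x0] + integral[y0][x0])
            # Same test as gray < mean - offset, kept in integer arithmetic.
            if gray[y][x] * area < total - offset * area:
                out[y][x] = 0
    return out
```

The window should be somewhat larger than the stroke width of the text, and `offset` controls how much darker than its surroundings a pixel must be to count as ink; both would need tuning on real photos. The result can then be passed to Tesseract in place of the original bitmap.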
Upvotes: 2