Reputation: 2325
I'd like to highlight a word in an image of a document when the user searches for that word, exactly like Google Books does.
As far as I know, Tesseract and other open source OCR programs don't support this sort of function, so does anyone have any ideas how it might be done?
Upvotes: 2
Views: 15483
Reputation: 37898
Yes, they "support" it. Sort of.
They give you a bounding rectangle that tells you where each recognized word is on the page. Using that, fill the rectangle on the image with the color of your choice using a color blending mode (e.g., keep the luma intact and alter only the chroma). This works well with B/W and grayscale images, which most books are, and is sufficient for most colored text too (except text on a colored background). For that case, invert the colors inside the rectangle instead of tinting them; many applications do this (Foxit Reader comes to mind).
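To make the blending step concrete, here is a minimal sketch using only the Python standard library. It assumes the page is already in memory as rows of `(r, g, b)` tuples and that the OCR engine has handed you a `(left, top, width, height)` box for the matched word; the function names `highlight_rect` and `invert_rect` are made up for this example. It uses YIQ as the luma/chroma split, which is just one reasonable choice.

```python
import colorsys


def _to_byte(value):
    """Convert a 0..1 float channel to a clamped 0..255 integer."""
    return max(0, min(255, round(value * 255)))


def highlight_rect(pixels, box, color=(255, 255, 0)):
    """Tint the pixels inside box with the chroma of color, keeping each pixel's luma.

    pixels: list of rows, each a list of (r, g, b) tuples in 0..255 (modified in place).
    box: (left, top, width, height), assumed to come from the OCR word rectangle.
    """
    left, top, width, height = box
    # Only the chroma (I, Q) of the highlight color is used; its luma is discarded.
    _, hi_i, hi_q = colorsys.rgb_to_yiq(*(c / 255 for c in color))
    for y in range(max(top, 0), min(top + height, len(pixels))):
        row = pixels[y]
        for x in range(max(left, 0), min(left + width, len(row))):
            luma, _, _ = colorsys.rgb_to_yiq(*(c / 255 for c in row[x]))
            r, g, b = colorsys.yiq_to_rgb(luma, hi_i, hi_q)
            row[x] = (_to_byte(r), _to_byte(g), _to_byte(b))
    return pixels


def invert_rect(pixels, box):
    """Alternative for colored backgrounds: invert every channel inside box."""
    left, top, width, height = box
    for y in range(max(top, 0), min(top + height, len(pixels))):
        row = pixels[y]
        for x in range(max(left, 0), min(left + width, len(row))):
            r, g, b = row[x]
            row[x] = (255 - r, 255 - g, 255 - b)
    return pixels


# Tiny 4x3 white "page" with one black "ink" pixel at (x=1, y=1).
page = [[(255, 255, 255)] * 4 for _ in range(3)]
page[1][1] = (0, 0, 0)
highlight_rect(page, (1, 1, 2, 1))  # tint x=1..2 on row 1
```

With this blend, white paper inside the box turns yellow while dark ink stays dark, so the word remains readable. In a real application you would do the same thing with an imaging library rather than nested lists, but the per-pixel logic is the same.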
Upvotes: 2