Rohit Suri

Reputation: 45

Detect text using Tesseract in complex backgrounds in OpenCV

I want to detect text in complex backgrounds such as this one using Tesseract in OpenCV. How should I go about this task?

Old Airtel Logo

Upvotes: 1

Views: 1729

Answers (1)

John Morris

Reputation: 416

See the answer to Detect white characters on black background using Tesseract. That answer references a paper describing how to recognize text in a background-independent manner.

The algorithm described in the paper by T. Kasar, J. Kumar, and A. G. Ramakrishnan, and as implemented by Jason Funk, consists of multiple stages. In the first stage, one performs edge detection on each channel (R, G, B) using Canny. One then combines these edges into a single grayscale image using bitwise_or.

Next we find contours. The key point is that the contours for letters and their bounding boxes follow certain rules (e.g., the contour is closed, the aspect ratio is reasonable, the bounding box is neither too big (page sized) nor too small (pixel sized), and the number of possible internal edges is known -- 0 for most letters, 1 for "o", 2 for "8" and "B"). So you sift through the bounding boxes and keep those that follow the rules. Each edge, however, generates two contours, one outside and one inside, which complicates the bookkeeping. I'm still not sure I have the logic right.
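To make the sifting concrete, here is a minimal sketch of such a filter in plain Python. The function name and every numeric limit (`min_side`, `max_frac`, `max_aspect`) are assumptions of mine for illustration, not values from the paper. It also does not check that the contour is closed, and it assumes you have already counted the internal contours (`holes`) for each box.

```python
def looks_like_letter(w, h, img_w, img_h, holes,
                      min_side=4, max_frac=0.5, max_aspect=5.0, max_holes=2):
    """Return True if a w x h bounding box could plausibly hold one letter.

    All limits are illustrative guesses; tune them for your images.
    """
    if min(w, h) <= 0 or w < min_side or h < min_side:
        return False                      # pixel-sized speck
    if w > max_frac * img_w or h > max_frac * img_h:
        return False                      # close to page-sized
    if max(w, h) / min(w, h) > max_aspect:
        return False                      # too long and thin for a letter
    return holes <= max_holes             # 0 for most letters, up to 2 ("8", "B")
```

For example, `looks_like_letter(20, 30, 640, 480, 1)` accepts a box the size of an "o", while a 600 x 400 box in the same 640 x 480 image is rejected as too large.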

In any case, the boxes you keep surround the letters and their interior spaces. The foreground intensity is just the average intensity you obtain by tracing the outline of the contour associated with the box in the source image. The background intensity is the median intensity you get when sampling the pixels around the four corners of the bounding box in the source image. If foreground intensity is less than background intensity, then foreground color is black; otherwise, foreground color is white. Bear in mind this is done for each bounding box.
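The polarity decision for one box is small enough to sketch directly. This assumes you have already collected the intensities along the contour outline and around the four corners of the box; the function and parameter names are hypothetical.

```python
from statistics import mean, median

def foreground_color(outline_values, corner_values):
    """Decide whether the text in one bounding box is dark or light.

    outline_values: intensities sampled along the letter's contour
    corner_values:  intensities sampled around the box's four corners
    """
    foreground = mean(outline_values)     # average along the outline
    background = median(corner_values)    # median around the corners
    return "black" if foreground < background else "white"
```

So `foreground_color([40, 50, 60], [200, 210, 190, 205])` gives "black" (dark text on a light background), and swapping the two ranges gives "white".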

So, for each bounding box, we color our result by comparing the intensity of each pixel in the source image to the foreground intensity for that bounding box. Based on this comparison, the pixel is determined to be part of the foreground or the background.
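Here is one way that per-pixel comparison might look, again as a sketch rather than the paper's exact rule. I am assuming that a pixel counts as text when it lies on the foreground side of the foreground intensity, and that the output should be black text (0) on white (255). The function name and the list-of-rows input are for illustration only.

```python
def binarize_box(pixels, fg, fg_is_dark):
    """Binarize the pixels of one bounding box.

    pixels:     rows of grayscale intensities from the source image
    fg:         foreground intensity estimated for this box
    fg_is_dark: True if the text is darker than its background
    Returns rows of 0 (text) and 255 (background).
    """
    out = []
    for row in pixels:
        if fg_is_dark:
            out.append([0 if p <= fg else 255 for p in row])
        else:
            out.append([0 if p >= fg else 255 for p in row])
    return out
```

Whatever the original polarity, every box ends up as dark text on a light background, so the assembled result has a uniform polarity before it goes to Tesseract.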

I think the approach is sound, but the details are a little tricky.

Upvotes: 1
