Aki24x

Reputation: 1068

Splitting text and background as preprocessing for OCR (Tesseract)

I am running OCR on text in TV footage (using Tesseract 3.x with C++). As a preprocessing step for OCR, I am trying to separate the text from the background.

In typical footage, the text and background are highly contrasted (such as white on black), so adjusting gamma does the job. However, the attached image (yellow text on an orange/red sky background) is giving me a hard time in preprocessing.

Yellow-text over orange sky

What would be a good way to separate this yellow text from the background?

Upvotes: 0

Views: 2360

Answers (1)

thewaywewere

Reputation: 8626

Below is a simple solution using Python 2.7, OpenCV 3.2.0 and Tesseract 4.0.0a. Converting the Python to C++ for OpenCV should not be difficult; then call the Tesseract API to perform OCR.

import numpy as np
import cv2
import matplotlib.pyplot as plt
%matplotlib inline 

def show(title, img, color=True):
    if color:
        plt.imshow(img[:,:,::-1]), plt.title(title), plt.show()
    else:
        plt.imshow(img, cmap='gray'), plt.title(title), plt.show()

def ocr(img):
    # I used a version of OpenCV with the Tesseract binding. Intended modes:
    #   Page Segmentation Mode (psmode) = 11 (default = 3)
    #   OCR Engine Mode (oem) = 3 (default = 3)
    # The last two arguments are oem and psmode, in that order.
    tesser = cv2.text.OCRTesseract_create('C:/Program Files/Tesseract 4.0.0/tessdata/', 'eng',
                                          'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz', 3, 3)
    retval = tesser.run(img, 0)  # returns the recognized text as a string
    print('OCR Output: ' + retval)

img = cv2.imread('./imagesStackoverflow/yellow_text.png')
show('original', img)

# apply GaussianBlur to smooth the image, then threshold: pixels in the yellow range
# become white (255) in the mask and everything else becomes black (0)
img = cv2.GaussianBlur(img,(5,5), 1) # smooth image
mask = cv2.inRange(img,(40,180,200),(70,220,240)) # keep the yellow range; low and high bounds in BGR order
show('mask', mask, False)

# invert the mask so the text is black on white
res = 255 - mask
show('result', res, False)

# pass to tesseract to perform OCR
ocr(res)
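
To make the mask and inversion steps easier to follow (and to port to C++), here is a rough pure-Python sketch of what they compute for each pixel. It assumes the same BGR bounds as above and that both bounds are inclusive, as with cv2.inRange. The names in_range and text_black_on_white and the sample pixels are made up for illustration; this is not how OpenCV implements it.

```python
# Pure-Python illustration of the mask and inversion steps (no OpenCV needed).
# The bounds are the BGR values used above; the sample pixels are made up.
LOW = (40, 180, 200)    # lower bound (B, G, R)
HIGH = (70, 220, 240)   # upper bound (B, G, R)

def in_range(pixel, low=LOW, high=HIGH):
    """Return 255 if every channel lies within [low, high], else 0."""
    inside = all(lo <= c <= hi for c, lo, hi in zip(pixel, low, high))
    return 255 if inside else 0

def text_black_on_white(image):
    """Threshold each pixel, then invert so text is 0 and background is 255."""
    return [[255 - in_range(px) for px in row] for row in image]

# Hypothetical 2x2 BGR image: one yellow text pixel, three sky pixels.
sample = [
    [(55, 200, 220), (30, 90, 230)],
    [(20, 60, 200), (60, 140, 250)],
]
result = text_black_on_white(sample)
```

Only the first sample pixel falls inside all three channel ranges, so it is the only one that ends up black in the result. A pixel that misses the range in even one channel is treated as background, which is why the bounds usually need tuning for each piece of footage.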

Processed Images and OCR Output (see last line in image):

enter image description here

Hope this helps.

Upvotes: 2
