GoodJuJu

Reputation: 1570

Using Tesseract to OCR matchTemplate Regions of Interest (ROI)

This is my very first attempt at using Python. I normally use .NET, but to identify shapes in documents I have turned to Python and OpenCV for image processing.

I am using OpenCV template matching (cv2.matchTemplate) to discover Regions of Interest (ROIs) in my documents.

This works well: the template matches the ROIs and rectangles are drawn to mark the matches.

The ROIs in my images contain text which I also need to extract via OCR. I am trying to do this with Tesseract, but judging by my results I think I am approaching it wrongly.

My process is shown in the code below.

In the image below, you can see the matched regions (which are fine), but the text printed by Tesseract (bottom right of each ROI) doesn't match the text inside the ROI.

Please could someone take a look and advise where I am going wrong?

import cv2
import numpy as np
import pytesseract
import imutils

img_rgb = cv2.imread('images/pd2.png')
img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)

template = cv2.imread('images/matchMe.png', 0)
w, h = template.shape[::-1]

res = cv2.matchTemplate(img_gray, template, cv2.TM_CCOEFF_NORMED)
threshold = 0.45
loc = np.where(res >= threshold)
for pt in zip(*loc[::-1]):
    cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0, 0, 255), 2)
    roi = img_rgb[pt, (pt[0] + w, pt[1] + h)]
    config = "-l eng --oem 1 --psm 7"
    text = pytesseract.image_to_string(roi, config=config)
    print(text)
    cv2.putText(img_rgb, text, (pt[0] + w, pt[1] + h),
                cv2.FONT_HERSHEY_SIMPLEX, 1.2, (0, 0, 255), 3)

cv2.imwrite('images/results.png', img_rgb)

Upvotes: 1

Views: 5708

Answers (1)

Knight Forked

Reputation: 1619

There were two issues in your code:

1. You were modifying the image (drawing the rectangle on it) before running OCR, so Tesseract saw the red border as well as the text.
2. roi was not constructed properly.
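The second issue comes down to NumPy indexing order: OpenCV images are arrays indexed [row, column], i.e. [y, x], so a crop must slice rows (y) first and columns (x) second. A minimal sketch with made-up match coordinates:

```python
import numpy as np

# OpenCV images are NumPy arrays indexed [row, column], i.e. [y, x].
img = np.arange(24).reshape(4, 6)   # 4 rows (height) by 6 columns (width)

x, y, w, h = 1, 2, 3, 2             # hypothetical match location and template size
roi = img[y:y + h, x:x + w]         # slice rows (y) first, then columns (x)

assert roi.shape == (h, w)          # height by width: (2, 3)
```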

import cv2
import numpy as np
import pytesseract

img_rgb = cv2.imread('tess.png')
img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)

template = cv2.imread('matchMe.png', 0)
w, h = template.shape[::-1]

res = cv2.matchTemplate(img_gray, template, cv2.TM_CCOEFF_NORMED)
threshold = 0.45
loc = np.where(res >= threshold)
for pt in zip(*loc[::-1]):
    roi = img_rgb[pt[1]:pt[1] + h, pt[0]: pt[0] + w]
    config = "-l eng --oem 1 --psm 7"
    text = pytesseract.image_to_string(roi, config=config)
    print(text)
    cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0, 0, 255), 2)
    cv2.putText(img_rgb, text, (pt[0] + w, pt[1] + h),
                cv2.FONT_HERSHEY_SIMPLEX, 1.2, (0, 0, 255), 3)

cv2.imwrite('results.png', img_rgb)

You might still have to feed Tesseract a properly filtered image to get meaningful recognition. Hope this helps.

Upvotes: 1
