Quentin M

Reputation: 181

Improve text reading from image

I am trying to read the credits from a movie. To build an MVP, I started with a single frame:

[image: frame of the movie credits]

I use this code:

print(pytesseract.image_to_string(cv2.imread('frames/frame_144889.jpg')))

I tried different page segmentation modes (psm), but it returns garbled text:

one Swimmer
Decay
Nurse
Aer
a
ig
coy
Coy
cor
ag
Or
Rr
Sa
Ae
Red
cod
Reng
OED Ty
Ryan Stunt Double
UST
er ey a er
Pm
JESSICA NAPIER
ALEX MALONE
Ey
DAMIEN STROUTHOS
JESSE ROWLES
DARIUS WILLIAMS
beamed
Aya
GEORGE HOUVARDAS
Sih
ata ARS Vara
BES liv4
MIKE DUNCAN
Pe
OV TN Ia
Ale Tate
SUV (aa: ae
SU aa
AIDEN GILLETT
MARK DUNCAN.

I tried other pictures with a higher resolution and got better results, but I want to be able to handle non-HD movies as well.

What can I do to improve the accuracy of the reading?

Regards, Quentin

Upvotes: 1

Views: 225

Answers (1)

Flippi96

Reputation: 71

I often get good results just by following this guideline for improving Tesseract accuracy: Tesseract - Improving the quality of the output

Important things to do are:

  • Use a white background and black characters.
  • Select the desired Tesseract page segmentation mode (psm). In this case, use psm 6 to treat the image as a single uniform block of text.
  • Use the tessedit_char_whitelist config option to specify only the characters you are searching for. In this case, all lowercase and uppercase letters of the English alphabet.
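
The psm and whitelist options from the list above end up in a single config string. As a minimal sketch, that string could be assembled with a small helper. The name build_tesseract_config is hypothetical and not part of pytesseract; only the --psm flag and the tessedit_char_whitelist variable are real Tesseract options.

```python
import string


def build_tesseract_config(psm=6, whitelist=string.ascii_letters):
    """Build a Tesseract config string (hypothetical helper, not a pytesseract API)."""
    # Tesseract 4+ accepts page segmentation modes 0 through 13.
    if not 0 <= psm <= 13:
        raise ValueError("psm must be between 0 and 13")
    parts = [f"--psm {psm}"]
    if whitelist:
        # No space after '=': the value must be part of the same argument.
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


config = build_tesseract_config()
print(config)
```

The returned string can be passed as the config argument of pytesseract.image_to_string or pytesseract.image_to_data. Building the whitelist from string.ascii_letters avoids accidentally leaving letters out when typing the alphabet by hand.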

Here is the full code:

import cv2
import numpy as np
import pytesseract

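# Path to the Tesseract executable (Windows example; adjust or remove on other systems)
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract'
img = cv2.imread('a.jpg')
grayImage = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Invert while thresholding so that light text on a dark background becomes black text on white
(_, blackWhiteImage) = cv2.threshold(grayImage, 127, 255, cv2.THRESH_BINARY_INV)
# Add a white border around the image
blackWhiteImage = cv2.copyMakeBorder(src=blackWhiteImage, top=100, bottom=100, left=50, right=50, borderType=cv2.BORDER_CONSTANT, value=(255,255,255))
# Whitelist all uppercase and lowercase English letters; --psm 6 treats the image as one uniform block of text
data = pytesseract.image_to_data(blackWhiteImage, config="-c tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz --psm 6")
# Convert back to BGR so coloured boxes and labels can be drawn on the result
originalImage = cv2.cvtColor(blackWhiteImage, cv2.COLOR_GRAY2BGR)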

text = []
for z, a in enumerate(data.splitlines()):
    # Skip the TSV header row
    if z != 0:
        a = a.split()
        # Only rows with all 12 fields contain a recognized word
        if len(a) == 12:
            x, y = int(a[6]), int(a[7])
            w, h = int(a[8]), int(a[9])
            # Draw the bounding box and the recognized word on the image
            cv2.rectangle(originalImage, (x, y), (x + w, y + h), (0, 255, 0), 1)
            cv2.putText(originalImage, a[11], (x, y - 2), cv2.FONT_HERSHEY_DUPLEX, 0.5, (0, 0, 255), 1)
            text.append(a[11])

print("Text result: \n", text)
cv2.imshow('Image result', originalImage)
cv2.waitKey(0)
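
Since credits are mostly first name plus surname, it can help to regroup the recognized words by line rather than keep a flat word list. The following is only a sketch: words_to_lines and min_conf are hypothetical names, and it assumes data is the TSV string returned by pytesseract.image_to_data with the usual column header (level, page_num, block_num, par_num, line_num, word_num, left, top, width, height, conf, text).

```python
import csv
import io


def words_to_lines(tsv_text, min_conf=0.0):
    """Group word rows of Tesseract TSV output into text lines (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    lines = {}
    for row in reader:
        word = (row.get("text") or "").strip()
        if not word:
            continue  # structural rows (page, block, line) carry no text
        try:
            conf = float(row["conf"])
        except (TypeError, ValueError):
            continue  # malformed row
        if conf < min_conf:
            continue  # drop low-confidence words
        key = (int(row["page_num"]), int(row["block_num"]),
               int(row["par_num"]), int(row["line_num"]))
        lines.setdefault(key, []).append(word)
    # Sort by page, block, paragraph and line number to keep reading order
    return [" ".join(words) for _, words in sorted(lines.items())]
```

Calling words_to_lines(data, min_conf=50) would return one string per detected line and discard words whose confidence is below 50. The threshold is an arbitrary starting point and needs tuning per movie.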

And the image with the expected result:

[image: processed frame with the recognized words boxed and labelled]

Upvotes: 1
