Improve text reading from image

Question

I am trying to read the credits from a movie. To build an MVP, I started with a single picture:

I use this code:

print(pytesseract.image_to_string(cv2.imread('frames/frame_144889.jpg')))

I tried different psm values, but it returns garbled text:

one Swimmer
Decay
Nurse
Aer
a
ig
coy
Coy
cor
ag
Or
Rr
Sa
Ae
Red
cod
Reng
OED Ty
Ryan Stunt Double
UST
er ey a er
Pm
JESSICA NAPIER
ALEX MALONE
Ey
DAMIEN STROUTHOS
JESSE ROWLES
DARIUS WILLIAMS
beamed
Aya
GEORGE HOUVARDAS
Sih
ata ARS Vara
BES liv4
MIKE DUNCAN
Pe
OV TN Ia
Ale Tate
SUV (aa: ae
SU aa
AIDEN GILLETT
MARK DUNCAN.

I tried another picture with a higher resolution and got better results, but I wish to be able to handle non-HD movies as well.

What could I do to improve the accuracy of the reading?

Regards, Quentin

Flippi96 · Accepted Answer

I often get good results just by following this guideline for improving Tesseract accuracy: Tesseract - Improving the quality of the output

Important things to do are:

Use white for the background and black for the characters.
Select the desired Tesseract psm mode. In this case, use psm mode 6 to treat the image as a single uniform block of text.
Use the tessedit_char_whitelist config to specify only the characters that you are searching for. In this case, all lowercase and uppercase characters of the English alphabet.
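The whitelist and psm mode from the list above can also be assembled programmatically, which avoids typing the alphabet by hand. This is only a minimal sketch: `build_config` is a hypothetical helper name, not part of pytesseract, and it assumes the value must follow `tessedit_char_whitelist=` directly, with no space.

```python
import string


def build_config(psm: int = 6, whitelist: str = string.ascii_letters) -> str:
    """Return a Tesseract config string with a character whitelist and a psm mode.

    build_config is a hypothetical helper for illustration only.
    """
    # "-c name=value" sets a Tesseract variable; "--psm N" selects the page segmentation mode.
    return f"-c tessedit_char_whitelist={whitelist} --psm {psm}"


config = build_config()
print(config)
```

The resulting string can be passed as the `config` argument of `pytesseract.image_to_string` or `pytesseract.image_to_data`.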

Here is the code:

import cv2
import numpy as np
import pytesseract

pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract'
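img = cv2.imread('a.jpg')
# Convert to grayscale, then binarize and invert so that light text on a
# dark background becomes black text on a white background.
grayImage = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
(_, blackWhiteImage) = cv2.threshold(grayImage, 127, 255, cv2.THRESH_BINARY_INV)
# Add a white border so the text does not touch the image edges.
blackWhiteImage = cv2.copyMakeBorder(src=blackWhiteImage, top=100, bottom=100, left=50, right=50, borderType=cv2.BORDER_CONSTANT, value=(255,255,255))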
data = pytesseract.image_to_data(blackWhiteImage, config="-c tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz --psm 6")
originalImage = cv2.cvtColor(blackWhiteImage, cv2.COLOR_GRAY2BGR)

text = []
for z, a in enumerate(data.splitlines()):
    if z != 0:
        a = a.split()
        if len(a) == 12:
            x, y = int(a[6]), int(a[7])
            w, h = int(a[8]), int(a[9])
            cv2.rectangle(originalImage, (x, y), (x + w, y + h), (0, 255, 0), 1)
            cv2.putText(originalImage, a[11], (x, y - 2), cv2.FONT_HERSHEY_DUPLEX, 0.5, (0, 0, 255), 1)
            text.append(a[11])

print("Text result: \n", text)
cv2.imshow('Image result', originalImage)
cv2.waitKey(0)
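The loop above splits each row on whitespace and relies on the row having exactly 12 fields. As a hedged alternative, the TSV string can be parsed by its header names instead of by position. The sketch below assumes the output has the columns `left`, `top`, `width`, `height`, `conf` and `text`, which is my understanding of what `image_to_data` emits. `parse_words` and the sample rows are made up for illustration.

```python
def parse_words(tsv: str, min_conf: float = 0.0):
    """Extract (text, left, top, width, height) tuples from image_to_data TSV output.

    parse_words is a hypothetical helper; the column names are an assumption.
    """
    lines = tsv.splitlines()
    if not lines:
        return []
    header = lines[0].split("\t")
    words = []
    for line in lines[1:]:
        fields = line.split("\t")
        if len(fields) != len(header):
            continue  # skip malformed rows
        row = dict(zip(header, fields))
        word = row["text"].strip()
        # Rows without text describe pages, blocks and lines rather than words.
        if not word or float(row["conf"]) < min_conf:
            continue
        words.append((word, int(row["left"]), int(row["top"]),
                      int(row["width"]), int(row["height"])))
    return words


# Made-up sample in the assumed TSV layout, not real OCR output.
sample = "\n".join([
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num\tleft\ttop\twidth\theight\tconf\ttext",
    "1\t1\t0\t0\t0\t0\t0\t0\t640\t480\t-1\t",
    "5\t1\t1\t1\t1\t1\t50\t100\t80\t20\t96\tJESSICA",
    "5\t1\t1\t1\t1\t2\t140\t100\t70\t20\t31\tNAPIER",
])
words = parse_words(sample, min_conf=50)
print(words)
```

Filtering on the confidence column is one way to drop fragments like the short garbage tokens in the question's output. The threshold of 50 is arbitrary and would need tuning on real frames.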

And the image with the expected result:
