Andresnex

Reputation: 87

pytesseract can't recognize the number 1

I'm running a script that returns the value and position of each digit on a numpad whose keys are shuffled. But when it tries to recognize the 1, it returns either 71 or 7.

This is the image I'm extracting the 1 from.

This is the script I'm running

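import cv2
import PIL.ImageOps
import pytesseract
from PIL import Image

# `numero` is presumably the cropped PIL image of one key: save it, then invert its colors
numero.save(r'C:\imagenes\numeros\numero.png')
image = Image.open(r'C:\imagenes\numeros\numero.png')
inverted_image = PIL.ImageOps.invert(image)
inverted_image.save(r'C:\imagenes\numeros\numero.png')

image = cv2.imread(r'C:\imagenes\numeros\numero.png')

# First pass: OCR with the Spanish language data
numero = int(pytesseract.image_to_string(image, lang='spa', config='--psm 6 digits'))
print("numero :", numero)

# If the result is 7 or not an expected value, retry with the English data
if numero == 7 or numero not in numeros:
    numero_eng = int(pytesseract.image_to_string(image, lang='eng', config='--psm 6 digits'))
    # Anything other than 7 on the second pass is treated as a 1
    if numero_eng != 7:
        numero = 1
    else:
        numero = numero_eng
print("numero:", numero)

vector = 930, 425, numero
vector_de_vectores.append(vector)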

Upvotes: 0

Views: 300

Answers (1)

Ahx

Reputation: 8005

Solution


1- Apply adaptive thresholding

2- Set the Tesseract page segmentation mode to --psm 7, since you are trying to recognize a single line of text. (See all psm modes)
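To illustrate what step 1 does, here is a minimal pure-Python sketch of mean adaptive thresholding with inverted output. It is not OpenCV's implementation: the function name, the toy image and the parameter values are made up for illustration. The idea is that each pixel is compared against the mean of its own neighborhood minus a constant, so a dark stroke turns white even when the overall brightness varies across the image.

```python
def adaptive_mean_threshold_inv(gray, max_value, block_size, c):
    """Sketch of mean adaptive thresholding with inverted binary output.

    gray is a list of rows of 0-255 ints; block_size must be odd.
    (Hypothetical helper, only meant to show the idea.)
    """
    height, width = len(gray), len(gray[0])
    half = block_size // 2
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            total = 0
            for dy in range(-half, half + 1):
                for dx in range(-half, half + 1):
                    # Clamp coordinates so edge pixels reuse the nearest valid pixel
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    total += gray[yy][xx]
            # Local threshold: neighborhood mean minus the constant c
            local_threshold = total / (block_size * block_size) - c
            # Inverted: pixels at or below the local threshold become max_value
            row.append(max_value if gray[y][x] <= local_threshold else 0)
        out.append(row)
    return out


# Toy 5x5 image: a dark vertical stroke (20) on a light background (200)
toy = [[200, 200, 20, 200, 200] for _ in range(5)]
binary = adaptive_mean_threshold_inv(toy, max_value=255, block_size=3, c=10)
for r in binary:
    print(r)  # each row prints [0, 0, 255, 0, 0]
```

The stroke ends up white on a black background, which is the kind of clean, high-contrast input that tends to help Tesseract with thin glyphs such as 1.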


Result of adaptive thresholding: [image]

When you read:

txt = pytesseract.image_to_string(thr, config="--psm 7")
print(txt)

Result:

1

Code:


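import cv2
import pytesseract

# Load the image and convert it to grayscale
img = cv2.imread("tUh0U.png")
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Adaptive mean threshold, inverted: block size 31, constant 61
thr = cv2.adaptiveThreshold(gry, 252, cv2.ADAPTIVE_THRESH_MEAN_C,
                            cv2.THRESH_BINARY_INV, 31, 61)
# --psm 7 treats the image as a single line of text
txt = pytesseract.image_to_string(thr, config="--psm 7")
print(txt)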
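One more thing that may help: the string returned by image_to_string can contain trailing whitespace, be empty, or hold more than one character (such as the 71 from the question), so calling int() on it directly is fragile. Below is a small sketch of a guard you could put around the result. The helper name parse_single_digit and its allowed parameter are my own, not part of pytesseract.

```python
import re


def parse_single_digit(ocr_text, allowed=range(10)):
    """Return the single digit found in raw OCR text, or None if unusable.

    Hypothetical helper: drops everything that is not 0-9 and only
    accepts the result if exactly one digit is left.
    """
    digits = re.sub(r"[^0-9]", "", ocr_text)
    if len(digits) != 1:
        return None
    value = int(digits)
    return value if value in allowed else None


print(parse_single_digit("1\n\x0c"))  # 1
print(parse_single_digit("71\n"))     # None
print(parse_single_digit(""))         # None
```

With this in place the script can retry with different preprocessing or settings whenever the helper returns None, instead of crashing or guessing.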

Upvotes: 1
