Maj Drmaz

Reputation: 11

Why doesn't Python Tesseract find anything in this photo?

I want to read the numbers in this picture:

[image]

import cv2
import pytesseract

# Point pytesseract at the Tesseract executable (Windows install path)
pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'

img1 = cv2.imread('white.png')

# Resize and convert to grayscale before running OCR
img1 = cv2.resize(img1, (650, 600))
img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
print(pytesseract.image_to_boxes(img1))
cv2.imshow('something', img1)
cv2.waitKey(3000)

With this code I don't get any output (except the image that pops up on the screen). When I change print(pytesseract.image_to_boxes(img1)) to print(pytesseract.image_to_data(img1)), I get 1 1 0 0 0 0 0 0 650 600 -1 as output.

Does anyone know why it doesn't work? Thank you for your help.

Upvotes: 1

Views: 270

Answers (1)

Ahx

Reputation: 7985

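You need to choose a suitable page segmentation mode (PSM). By default Tesseract uses fully automatic page segmentation (--psm 3), which apparently finds no text in this image.

If you set psm to 6 (assume a single uniform block of text):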

text = pytesseract.image_to_string(gray_image, config='--psm 6')

The result will be:

5 3 7
6 195
9 8 6
8 6 3
4 8 3 1
7 2 6
6 2 8
419 5
8 7 9

If you still have difficulty with clean, artifact-free images, try other psm values, or pad the image with a border using cv2.copyMakeBorder so that the text is centered.
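For orientation, here is a small sketch of the PSM values that seem most relevant when experimenting. The descriptions are paraphrased from the output of `tesseract --help-psm`; `PSM_MODES` and `psm_config` are hypothetical names used only for illustration, and the helper merely builds the string passed as `config=`.

```python
# A few Tesseract page segmentation modes, paraphrased from `tesseract --help-psm`.
# This is not the full list; valid values range from 0 to 13.
PSM_MODES = {
    3: "fully automatic page segmentation, no OSD (default)",
    4: "single column of text of variable sizes",
    6: "single uniform block of text",
    7: "single text line",
    10: "single character",
    11: "sparse text, in no particular order",
}


def psm_config(psm: int) -> str:
    """Build the config string for pytesseract (hypothetical helper)."""
    if not 0 <= psm <= 13:
        raise ValueError(f"psm must be between 0 and 13, got {psm}")
    return f"--psm {psm}"


if __name__ == "__main__":
    for psm, description in PSM_MODES.items():
        print(psm_config(psm), "->", description)
```

The returned string can be passed directly, for example as `config=psm_config(6)`.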

Code:

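import cv2
import pytesseract

# Load the image and convert it to grayscale
bgr_image = cv2.imread("GWKS6.png")
gray_image = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
# Optional: pad the image with a 100-pixel white border so the text is centered
# centered_image = cv2.copyMakeBorder(gray_image, 100, 100, 100, 100, cv2.BORDER_CONSTANT, value=255)
# Treat the image as a single uniform block of text
text = pytesseract.image_to_string(gray_image, config='--psm 6')
print(text)
cv2.imshow('', gray_image)
cv2.waitKey(0)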
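If `--psm 6` does not work for a different image, one way to follow the advice above is to run the OCR under several modes and compare the results. This is only a minimal sketch: `try_psm_modes` and its `ocr` parameter are hypothetical names, not part of pytesseract, and by default the helper falls back to `pytesseract.image_to_string`.

```python
from typing import Callable, Dict, Iterable, Optional


def try_psm_modes(
    image,
    modes: Iterable[int] = (6, 4, 11),
    ocr: Optional[Callable[..., str]] = None,
) -> Dict[int, str]:
    """Run OCR on `image` once per mode and return {psm: stripped text}.

    Hypothetical helper: `ocr` defaults to pytesseract.image_to_string and
    can be replaced by any callable that accepts (image, config=...).
    """
    if ocr is None:
        # Imported lazily so the helper can also be exercised with a stub.
        import pytesseract
        ocr = pytesseract.image_to_string

    results = {}
    for psm in modes:
        # Each call uses only the page segmentation mode as config.
        results[psm] = ocr(image, config=f"--psm {psm}").strip()
    return results
```

Called as `try_psm_modes(gray_image)` with the grayscale image from the code above, it returns one text per mode; an empty string means that mode found nothing.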

Upvotes: 1
