Dynamicnotion

Reputation: 313

Tesseract - digit recognition with many errors

I want to recognize digits in images, so I have been experimenting with Tesseract and Python. I looked into how to prepare the images and ran Tesseract on them, and I am disappointed by how poorly the digits are recognized. I preprocessed the images with OpenCV and thought I did a pretty good job (see the examples below), but Tesseract still makes a lot of errors. Am I expecting too much? Looking at these example images, I would think Tesseract should identify the digits easily. I am wondering whether the accuracy just is not there yet or whether my configuration is not optimal. Any help or direction would be greatly appreciated.

Things I tried to improve the digit recognition (nothing seemed to improve the results significantly):

Examples:

Image 1:

Tesseract recognized: 72

Image 2:

Tesseract recognized: 0

EDIT: Image 3:

https://ibb.co/1qVtRYL

Tesseract recognized: 1723

Upvotes: 0

Views: 951

Answers (1)

Ian Chu

Reputation: 3143

I'm not sure what's going wrong for you. I downloaded those images and tesseract interprets them just fine for me. What version of tesseract are you using (I'm using 5.0)?
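If it helps to answer the version question, here is a minimal sketch that asks the installed binary directly. The helper name `tesseract_version` is made up for this example, and it assumes the executable is called `tesseract` and is on your PATH (pass a full path otherwise).

```python
import shutil
import subprocess


def tesseract_version(cmd="tesseract"):
    """Return the first line of `<cmd> --version`, or None if the binary is not found.

    `tesseract_version` is a hypothetical helper, not part of pytesseract.
    """
    path = shutil.which(cmd)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Check both streams in case the banner is not written to stdout.
    output = result.stdout or result.stderr
    return output.splitlines()[0] if output else None


print(tesseract_version())
```

A result of `None` means the binary was not found under that name, which would also explain pytesseract failing to run.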

781429

209441

import cv2
import pytesseract
from PIL import Image

# point pytesseract at the Tesseract binary (adjust for your install)
pytesseract.pytesseract.tesseract_cmd = r'C:\Users\ichu\AppData\Local\Programs\Tesseract-OCR\tesseract.exe'

# load images
first = cv2.imread("first_text.png")
second = cv2.imread("second_text.png")
images = [first, second]

# convert OpenCV's BGR arrays to Pillow RGB images
pimgs = []
for img in images:
    rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    pimgs.append(Image.fromarray(rgb))

# run OCR, restricted to digits
for img in pimgs:
    text = pytesseract.image_to_string(img, config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')
    print(text.strip())  # drops the trailing newline + form feed
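Tesseract's output usually ends with a newline and a form feed, and stray non-digit characters can slip in as well. As a rough sketch (the helper names `build_digit_config` and `clean_digits` are invented for this example, not part of pytesseract), the config string and the cleanup could be factored out like this:

```python
import re


def build_digit_config(psm=10, oem=3):
    """Build a Tesseract config string that whitelists digits only.

    Hypothetical helper; the defaults mirror the flags used in this answer.
    """
    return f"--psm {psm} --oem {oem} -c tessedit_char_whitelist=0123456789"


def clean_digits(raw):
    """Drop everything except digit characters from raw OCR output."""
    return re.sub(r"\D", "", raw)


# Intended use with pytesseract (not run here):
#   text = pytesseract.image_to_string(img, config=build_digit_config())
#   print(clean_digits(text))
print(build_digit_config())
print(clean_digits("781429\n\x0c"))  # 781429
```

Tesseract documents page segmentation mode 10 as "single character" and mode 7 as "single text line", so if multi-digit images come out wrong, `build_digit_config(psm=7)` might be worth trying. That is a guess, not something verified against your images.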

Upvotes: 1
