Reputation: 185
OK, so I've been trying to preprocess my image in whatever way works, but I cannot seem to find the right settings.
As you can see, the picture is already about as simple as it gets, but Tesseract still cannot recognize '1 BB' from the image. Any tips?
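import numpy as np
import pytesseract
from PIL import Image, ImageEnhance

# img is assumed to already hold the source picture as a NumPy array
img = Image.fromarray(img)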
imp_arr = np.asarray(img)
imp_arr = (np.floor(imp_arr / 140.0) * 255.0).astype('uint8')
img = Image.fromarray(imp_arr, mode='L')
width, height = img.size
img = img.resize((width*3, height*3), Image.BICUBIC)
width, height = img.size
img = img.resize((width*2, height*2), Image.HAMMING)
width, height = img.size
img = img.resize((int(width*0.3), int(height*0.3)), Image.BICUBIC)
img = ImageEnhance.Brightness(img).enhance(0.7)
img = ImageEnhance.Sharpness(img).enhance(2)
img = ImageEnhance.Contrast(img).enhance(2)
amount = pytesseract.image_to_string(img, config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')
This is just an example of what I've tried in order to get the correct text as a string. Sometimes it works; other times it prints gibberish. The thing is, it needs to work every single time, especially for a picture as clear as this one. Is there a mastermind who has a simple solution to this problem? Thank you in advance.
Upvotes: 1
Views: 1141
Reputation: 1099
After installing Tesseract OCR, Pillow and pytesseract, I saved your image as igor.png
and ran the following code, which I found in the docs of pytesseract:
#!/usr/bin/env python
from PIL import Image
import pytesseract
print(pytesseract.image_to_string(Image.open("igor.png")))
It prints the expected result:
1BB
If I correct your initial code a bit by adding the letter B to the tessedit_char_whitelist, it works as well.
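For illustration, the whitelist fix could be sketched as below. The helper names `build_tesseract_config` and `read_amount` are hypothetical, made up for this example; only `pytesseract.image_to_string` and the `tessedit_char_whitelist` option come from the code in the question. The `--psm 10 --oem 3` values are copied from the question unchanged. Note that `--psm 10` tells Tesseract to treat the image as a single character, so `--psm 7` (a single text line) may be a better fit for '1 BB'; that is an untested assumption.

```python
def build_tesseract_config(whitelist, psm=10, oem=3):
    """Build a Tesseract config string that only allows the given characters."""
    return f"--psm {psm} --oem {oem} -c tessedit_char_whitelist={whitelist}"


def read_amount(path, whitelist="0123456789B"):
    """Run OCR on the image at `path`, restricted to `whitelist`."""
    # Imported here so the config helper above can be used (and tested)
    # on a machine without Pillow or Tesseract installed.
    from PIL import Image
    import pytesseract

    config = build_tesseract_config(whitelist)
    text = pytesseract.image_to_string(Image.open(path), config=config)
    return text.strip()


# Usage, assuming the image from the question is saved as igor.png:
#   print(read_amount("igor.png"))
```

The only change from the question's config is the extra `B` in the whitelist: with digits alone, Tesseract is not allowed to output the letters in '1 BB' at all, whatever preprocessing is applied.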
Upvotes: 1