Reputation: 193
I have an image, but I am unable to extract the prices from it. This is what I have:
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract'
print(pytesseract.image_to_string("local-filename.jpg"))
Output:
Nestle Bakers’
Choice Melts
290g/
Choc Bits
200g
Altimate
Salted Caramel
Waffle Cones
12's
~ Seitarium ss, :
et-E Ly y ”.
oss a
=| x
) " 4
oat
.
FruitCo Juice 2 Litres
‘Apple/ Apricot/ Apple, Mange,
‘Banana/ Apple Pea
Cottee’s Jams
Betty Crocker Triple
500g
Sanitarium Weet-bix
750g Chocolate Muffin Mix 500g
Ss
>
s
Authentic Thai
; Sweet Chili Sauce
Vanilla em, ‘ 725ml
Dell
cours ® ‘OCOMUT HE
Sandhurst Coconut Milk
Chelsea Berry/ Vanilla
400m!
Icing Sugar 3759
Process finished with exit code 0
And this is the image I am trying to analyze.
What I need is the price of each product in the image together with the corresponding product name. I am able to extract the product names but not the prices. How can I achieve this? Any help would be appreciated. Please note that I am very new to image processing.
Upvotes: 1
Views: 566
Reputation: 845
In my experience, the Google Vision API gives the best results. Google Cloud offers $300 in free credits to new users.
Below is a code snippet that does this:
def detect_text(path):
    """Detects text in the file."""
    from google.cloud import vision
    import io

    client = vision.ImageAnnotatorClient()

    with io.open(path, 'rb') as image_file:
        content = image_file.read()

    image = vision.Image(content=content)

    response = client.text_detection(image=image)
    texts = response.text_annotations
    print('Texts:')

    for text in texts:
        print('\n"{}"'.format(text.description))

        vertices = (['({},{})'.format(vertex.x, vertex.y)
                     for vertex in text.bounding_poly.vertices])

        print('bounds: {}'.format(','.join(vertices)))

    if response.error.message:
        raise Exception(
            '{}\nFor more info on error messages, check: '
            'https://cloud.google.com/apis/design/errors'.format(
                response.error.message))
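Since the snippet above prints a bounding box for every detected text, one way to get from there to "price with the corresponding name" is to match each price to the nearest non-price text. The following is only a minimal sketch under assumptions: `pair_prices_with_names`, `PRICE_RE`, and the sample coordinates are hypothetical and not part of the Vision API, and each box centre is assumed to have been computed beforehand by averaging the `vertex.x` and `vertex.y` values of `text.bounding_poly.vertices`. The first annotation usually covers the whole image text, so it would likely need to be skipped before pairing.

```python
import math
import re

# Hypothetical pattern: matches tokens such as "7.95" or "$7.95".
PRICE_RE = re.compile(r"^\$?(\d+\.\d{2})$")


def pair_prices_with_names(detections):
    """Pair each price with the nearest non-price text.

    detections: list of (text, (x, y)) tuples, where (x, y) is the
    centre of the text's bounding box.
    Returns a list of (name, price) tuples.
    """
    prices = []
    names = []
    for text, center in detections:
        match = PRICE_RE.match(text.strip())
        if match:
            prices.append((float(match.group(1)), center))
        else:
            names.append((text, center))

    pairs = []
    for price, price_center in prices:
        if not names:
            break
        # Nearest name by straight-line distance between box centres.
        nearest = min(names, key=lambda item: math.dist(item[1], price_center))
        pairs.append((nearest[0], price))
    return pairs


# Made-up coordinates, for illustration only.
sample = [
    ("Choice Melts", (100, 200)),
    ("7.95", (120, 260)),
    ("Weet-bix", (500, 200)),
    ("4.50", (510, 255)),
]
print(pair_prices_with_names(sample))
```

Nearest-centre matching is a simplification. On a dense catalogue page it may pick the wrong neighbour, so restricting candidates to text directly above or beside the price could work better.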
Upvotes: 1
Reputation: 3158
I tried two options, pytesseract and easyocr, on a crop of the image that contains only a price.
Here are the results:
import cv2
import pytesseract
import easyocr

image = cv2.imread('795.png')
print(pytesseract.image_to_string(image))  # printed only whitespace, i.e. no result

reader = easyocr.Reader(['en'], gpu=False)  # load the model into memory once
result = reader.readtext(image, detail=0)  # result: ['s7.95', 'cach']
easyocr worked better here.
Next, I ran both on the image that includes the product description:
image = cv2.imread('795 Product.png')
reader.readtext(image,detail=0)
'''
['Nestle',
'eaa',
'Nestle',
'RuS',
'aa',
'melts',
'PARKCHOC',
'chocbts',
'Nestle Bakers',
'S',
'Choice Melts',
'290g/',
'cach',
'Choc Bits',
'200g',
'Nestle',
'"8628',
'nelts',
'(Neste)',
'JTE CHOC',
'7.95']
'''
print(pytesseract.image_to_string(image))
'''
Nestle Bakers’
Choice Melts
290g/
Choc Bits
200g
'''
easyocr worked better on these images.
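The easyocr results above contain price-like tokens such as 's7.95' and '7.95' mixed in with product text. As a rough sketch of how they could be filtered out, the snippet below applies a regular expression to the token list. The function name `extract_prices` and the pattern are my own assumptions, including the guess that the leading "s" is a misread "$".

```python
import re

# Assumed pattern: an optional leading "s", "S" or "$" (OCR may misread
# the dollar sign), then up to three digits, a dot, and two decimals.
PRICE_TOKEN = re.compile(r"^[sS$]?(\d{1,3}\.\d{2})$")


def extract_prices(tokens):
    """Return the numeric value of every token that looks like a price."""
    prices = []
    for token in tokens:
        match = PRICE_TOKEN.match(token.strip())
        if match:
            prices.append(float(match.group(1)))
    return prices


# A subset of the tokens easyocr returned above.
tokens = ['Nestle Bakers', 'Choice Melts', '290g/', 'cach',
          'Choc Bits', '200g', '"8628', '7.95']
print(extract_prices(tokens))
```

Calling `reader.readtext(image)` without `detail=0` also returns a bounding box for each token, which could then be used to associate each price with the nearby product name.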
You will need to explore which option you want to go forward with. You can also try the recommendation provided by @nathancy in "How to process and extract text from image".
Upvotes: 1