Reputation: 72616
I have an image like the following:
and I want to extract the text from it, which should be ws35.
I tried the pytesseract library, using:
pytesseract.image_to_string(Image.open(path))
but it returns nothing. Am I doing something wrong? How can I recover the text with OCR? Do I need to apply a filter to the image first?
Upvotes: 0
Views: 640
Reputation:
Similar to @SilverMonkey's suggestion: Gaussian blur followed by Otsu thresholding.
Upvotes: 1
Reputation: 955
You may need to apply some image processing or enhancement to it first. Look at this post, read the suggestions, and try applying them.
Upvotes: 0
Reputation: 1003
You can try the following approach:
I don't use Tesseract myself, so I couldn't test it on this picture, but online OCR tools seem to identify the sequence correctly (especially if you use the blurred version).
Upvotes: 5
Reputation: 2794
The problem is that this picture is low quality and very noisy! Even professional and enterprise programs struggle with images like this.
You have most likely seen a CAPTCHA before. One reason those exist is that the image and your answer are sent back to a database and then used to train computers to read images like these.
The short answer is: pytesseract can't read the text in this image, and most likely no other module or professional program can read it either.
Upvotes: 0