Reputation: 8160
This is the image that I will import
My Python code:
try:
    import Image
except ImportError:
    from PIL import Image
import pytesseract
print(pytesseract.image_to_string(Image.open('/home/milenko/Pictures/Screenshot from 2018-03-06 19-03-19.png')))
When I run the code with
python a72.py
the output is an empty line. That does not make any sense. Why?
Upvotes: 0
Views: 1140
Reputation: 1376
Try tweaking your command a little, e.g. by using a different Page Segmentation Mode (PSM). The default mode is "Fully automatic page segmentation, but no OSD", so Tesseract does not perform orientation and script detection (OSD) unless you ask for it.
This one gives me some output (the flag is spelled -psm in Tesseract 3 and --psm in Tesseract 4 and later):
print(pytesseract.image_to_string(Image.open('image.png'), config='-psm 12'))
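Since it is hard to know in advance which mode suits a given screenshot, one option is to try a few and compare the results. The following is only a sketch: the helper names psm_config and try_psm_modes are made up for illustration, and the list of modes to try is an assumption, not a recommendation.

```python
def psm_config(mode, flag='--psm'):
    """Build the config string passed to pytesseract.

    '--psm' is the Tesseract 4+ spelling; pass flag='-psm' for Tesseract 3.
    """
    return '%s %d' % (flag, mode)


def try_psm_modes(image, modes=(3, 6, 11, 12), flag='--psm'):
    """Run OCR once per page segmentation mode and collect the text.

    Hypothetical helper: `image` is anything pytesseract accepts,
    e.g. a PIL image. The default `modes` are an arbitrary selection.
    """
    # Imported here so the config helper above works without Tesseract installed
    import pytesseract

    results = {}
    for mode in modes:
        results[mode] = pytesseract.image_to_string(
            image, config=psm_config(mode, flag))
    return results
```

Calling try_psm_modes(Image.open('image.png')) returns a dict keyed by mode number, so you can print each entry and see which mode recovers the most text.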
You can also use OpenCV to prepare the image for OCR, e.g.:
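#!/usr/bin/python
import cv2 as cv
import pytesseract
from matplotlib import pyplot as plt

# Load the screenshot as a single-channel grayscale image
img = cv.imread('/tmp/image.png', cv.IMREAD_GRAYSCALE)
# Binarize: pixels brighter than 220 become white (255), all others black
ret, thresh = cv.threshold(img, 220, 255, cv.THRESH_BINARY)
# Display the thresholded image to check that the text is still legible
plt.axis('off')
plt.imshow(thresh, 'gray')
plt.show()
# PSM 12 is "Sparse text with OSD"
print(pytesseract.image_to_string(thresh, config='-psm 12'))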
As a next step, you could split the image into parts (x-axis, y-axis, trend line) and run OCR on each part separately, with a suitable PSM value for each.
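A rough sketch of that splitting step might look like the code below. Everything specific in it is an assumption: the function names, the idea that the axis labels sit in fixed-size margins on the left and bottom of the screenshot, and the PSM chosen for each region (6 = single uniform block, 7 = single text line, 11 = sparse text). You would need to measure the margins on your own image.

```python
def split_plot_regions(width, height, left_margin, bottom_margin):
    """Return PIL-style crop boxes (left, upper, right, lower).

    Assumes the y-axis labels occupy a strip on the left and the
    x-axis labels a strip along the bottom (hypothetical layout).
    """
    plot_bottom = height - bottom_margin
    return {
        'y_axis': (0, 0, left_margin, plot_bottom),
        'x_axis': (left_margin, plot_bottom, width, height),
        'plot_area': (left_margin, 0, width, plot_bottom),
    }


# Assumed PSM per region; tune these for your own image
REGION_PSM = {'y_axis': 6, 'x_axis': 7, 'plot_area': 11}


def ocr_regions(path, left_margin, bottom_margin, flag='--psm'):
    """Crop each region and OCR it with its own PSM value.

    '--psm' is the Tesseract 4+ spelling; pass flag='-psm' for Tesseract 3.
    """
    # Imported here so the box arithmetic above can be used on its own
    from PIL import Image
    import pytesseract

    image = Image.open(path)
    boxes = split_plot_regions(image.width, image.height,
                               left_margin, bottom_margin)
    results = {}
    for name, box in boxes.items():
        config = '%s %d' % (flag, REGION_PSM[name])
        results[name] = pytesseract.image_to_string(image.crop(box),
                                                    config=config)
    return results
```

For example, ocr_regions('/tmp/image.png', left_margin=100, bottom_margin=50) would return one string per region. The margin values here are placeholders.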
Upvotes: 2