Reputation: 4049
I'm trying to use Tesseract from Python to do a simple job:
- open a picture
- run OCR
- get the string
- get the character coordinates
The last one is my pain!
Here is my first code:
import tesseract
import glob
import cv2
api = tesseract.TessBaseAPI()
api.SetVariable("tessedit_char_whitelist", "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZéèô%")
api.SetPageSegMode(tesseract.PSM_AUTO)
imagepath = "C:\\Project\\Bob\\"
imagePathList = glob.glob(imagepath + "*.jpg")
for image in imagePathList:
    # Read the current image (not a fixed index) as raw bytes for OCR
    mBuffer = open(image, "rb").read()
    result = tesseract.ProcessPagesBuffer(mBuffer, len(mBuffer), api)
    # Draw the recognized text on the image and show it
    img = cv2.imread(image)
    cv2.putText(img, result, (20, 20), cv2.FONT_HERSHEY_PLAIN, 1.0, (0, 255, 0))
    cv2.imshow("Original", img)
    cv2.waitKey()
As my pictures have various layouts, with different words at different positions, I would like to get a bounding box for every character.
I have seen mentions of:
- api.getBoxText
- hOCR
But I have found no way to use either of them from Python.
Upvotes: 2
Views: 6512
Reputation: 4277
tesserocr provides the capability to access pretty much all of tesseract's API functionality. Here's an example that might be what you want:
from PIL import Image
from tesserocr import PyTessBaseAPI, RIL
image = Image.open('/usr/src/tesseract/testing/phototest.tif')
with PyTessBaseAPI() as api:
    api.SetImage(image)
    boxes = api.GetComponentImages(RIL.TEXTLINE, True)
    print('Found {} textline image components.'.format(len(boxes)))
    for i, (im, box, _, _) in enumerate(boxes):
        # im is a PIL image object
        # box is a dict with x, y, w and h keys
        api.SetRectangle(box['x'], box['y'], box['w'], box['h'])
        ocrResult = api.GetUTF8Text()
        conf = api.MeanTextConf()
        print(u"Box[{0}]: x={x}, y={y}, w={w}, h={h}, "
              "confidence: {1}, text: {2}".format(i, conf, ocrResult, **box))
You can also access other API methods such as GetHOCRText and GetBoxText, among others.
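Since the question asks for per-character coordinates, here is a rough sketch of how the text returned by GetBoxText could be turned into boxes usable with OpenCV. It assumes the usual Tesseract box-file layout, one line per symbol in the form `<char> <left> <bottom> <right> <top> <page>` with the origin at the bottom-left corner of the image. The helper name `parse_box_text` and the sample string are made up for illustration.

```python
def parse_box_text(box_text, image_height):
    """Turn box-file text into (char, x, y, w, h) tuples.

    Assumes each line is "<char> <left> <bottom> <right> <top> <page>"
    with a bottom-left origin; y is flipped to a top-left origin so the
    result can be used directly with OpenCV or PIL drawing calls.
    """
    boxes = []
    for line in box_text.splitlines():
        # Split from the right so the symbol itself is kept intact
        parts = line.rstrip().rsplit(" ", 5)
        if len(parts) != 6:
            continue  # skip blank or malformed lines
        char = parts[0]
        left, bottom, right, top = (int(v) for v in parts[1:5])
        boxes.append((char, left, image_height - top, right - left, top - bottom))
    return boxes


# Hypothetical sample in box-file format, for a 100-pixel-high image
sample = "T 10 70 30 90 0\ne 32 70 44 84 0\n"
for char, x, y, w, h in parse_box_text(sample, 100):
    print(char, x, y, w, h)
```

Inside the `with PyTessBaseAPI() as api:` block above, something like `parse_box_text(api.GetBoxText(0), image.height)` should then give one tuple per recognized character, assuming the page index is 0 for a single image.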
However, right now it only supports *nix systems, although a user has successfully compiled it on Windows and provided binaries, if you'd like to give it a try.
Disclaimer: tesserocr author here.
Upvotes: 3
Reputation: 8355
You may want to call the GetHOCRText method instead, if the Python wrapper supports it.
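If the wrapper does return hOCR, the coordinates still have to be pulled out of the markup. Below is a minimal sketch using only the standard library. It assumes the typical Tesseract hOCR layout, where each word is a `span` with class `ocrx_word` and a `title` attribute containing `bbox x0 y0 x1 y1` (top-left origin). The class name `HocrWordParser`, the helper `parse_hocr_words` and the sample markup are illustrative. By default this yields word-level boxes rather than character-level ones.

```python
import re
from html.parser import HTMLParser

BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class HocrWordParser(HTMLParser):
    """Collect (text, (x0, y0, x1, y1)) for every ocrx_word element."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None   # bbox of the word currently being read
        self._depth = 0     # nesting depth inside the current word
        self._text = []

    def handle_starttag(self, tag, attrs):
        if self._bbox is not None:
            self._depth += 1  # nested markup such as <strong> inside a word
            return
        attrs = dict(attrs)
        if "ocrx_word" in (attrs.get("class") or "").split():
            match = BBOX_RE.search(attrs.get("title") or "")
            if match:
                self._bbox = tuple(int(v) for v in match.groups())
                self._depth = 1
                self._text = []

    def handle_endtag(self, tag):
        if self._bbox is None:
            return
        self._depth -= 1
        if self._depth == 0:  # the word's own closing tag
            self.words.append(("".join(self._text).strip(), self._bbox))
            self._bbox = None

    def handle_data(self, data):
        if self._bbox is not None:
            self._text.append(data)


def parse_hocr_words(hocr):
    parser = HocrWordParser()
    parser.feed(hocr)
    parser.close()
    return parser.words


# Hypothetical hOCR fragment in the shape Tesseract usually emits
sample = (
    "<span class='ocrx_word' id='word_1_1' "
    "title='bbox 36 92 96 116; x_wconf 95'>This</span> "
    "<span class='ocrx_word' id='word_1_2' "
    "title='bbox 109 92 129 116; x_wconf 96'><strong>is</strong></span>"
)
for text, bbox in parse_hocr_words(sample):
    print(text, bbox)
```

Each box can then be drawn with `cv2.rectangle(img, (x0, y0), (x1, y1), ...)`, since hOCR coordinates already use a top-left origin.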
Upvotes: 0