Reputation: 23
I am trying to do number plate recognition using Tesseract 4.0.0-beta.1. The Tesseract documentation says to create box files in the form . I tried using the "makebox" function, but it does not detect every character properly. Then I read somewhere that this function is for version 3.x.
I later tried the "wordstrbox" function, but the box file created this way is empty. Can someone tell me how to create box files for Tesseract 4.0.0-beta.1?
Upvotes: 1
Views: 8187
Reputation: 174
I found AlfyFaisy's answer very helpful and just wanted to share code for viewing the bounding boxes of single characters. The difference lies in the keys of the dictionary returned by the image_to_boxes method:
import pytesseract
import cv2
from pytesseract import Output
img = cv2.imread('image.png')
height = img.shape[0]
width = img.shape[1]
# image_to_boxes returns one entry per recognized character
d = pytesseract.image_to_boxes(img, output_type=Output.DICT)
n_boxes = len(d['char'])
for i in range(n_boxes):
    (text,x1,y2,x2,y1) = (d['char'][i],d['left'][i],d['top'][i],d['right'][i],d['bottom'][i])
    # Box coordinates are measured from the bottom-left corner, so flip y for OpenCV
    cv2.rectangle(img, (x1,height-y1), (x2,height-y2) , (0,255,0), 2)
# Show the image once all boxes have been drawn
cv2.imshow('img',img)
cv2.waitKey(0)
At least on my machine (Python 3.6.8, cv2 4.1.0), the cv2 method is waitKey(0), with a capital K.
This is the output I got:
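The y-axis flip used in the loop above can be pulled out into a small helper, which makes it easier to check without an image at hand. This is only a minimal sketch: the function name boxes_to_opencv_rects and the sample values are made up for illustration, and it assumes a dictionary shaped like the one image_to_boxes returns with Output.DICT (keys char, left, bottom, right, top).

```python
def boxes_to_opencv_rects(boxes, image_height):
    """Convert character boxes with a bottom-left origin into
    (char, top_left, bottom_right) tuples with a top-left origin."""
    rects = []
    for ch, left, bottom, right, top in zip(
        boxes["char"], boxes["left"], boxes["bottom"],
        boxes["right"], boxes["top"]
    ):
        # Flip the y values: OpenCV measures y from the top of the image.
        top_left = (left, image_height - top)
        bottom_right = (right, image_height - bottom)
        rects.append((ch, top_left, bottom_right))
    return rects


# Hypothetical sample data standing in for real image_to_boxes output.
sample = {
    "char": ["A", "B"],
    "left": [10, 40],
    "bottom": [20, 20],
    "right": [30, 60],
    "top": [50, 50],
}
rects = boxes_to_opencv_rects(sample, image_height=100)
```

Each resulting pair of points could then be passed to cv2.rectangle in place of the inline arithmetic.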
Upvotes: 5
Reputation: 444
Use pytesseract.image_to_data()
import pytesseract
import cv2
from pytesseract import Output
img = cv2.imread('image.jpg')
# image_to_data returns one entry per detected block, line and word
d = pytesseract.image_to_data(img, output_type=Output.DICT)
n_boxes = len(d['level'])
for i in range(n_boxes):
    (text,x,y,w,h) = (d['text'][i],d['left'][i],d['top'][i],d['width'][i],d['height'][i])
    # left/top is the upper-left corner, so no y flip is needed here
    cv2.rectangle(img, (x,y), (x+w,y+h) , (0,255,0), 2)
cv2.imshow('img',img)
cv2.waitKey(0)
Among the data returned by pytesseract.image_to_data():
left is the distance from the upper-left corner of the bounding box to the left border of the image.
top is the distance from the upper-left corner of the bounding box to the top border of the image.
width and height are the width and height of the bounding box.
conf is the model's confidence in its prediction for the word within that bounding box. If conf is -1, the corresponding bounding box contains a block of text rather than just a single word.
The bounding boxes returned by pytesseract.image_to_boxes() enclose letters, so I believe pytesseract.image_to_data() is what you're looking for.
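Since entries with conf equal to -1 describe blocks rather than words, it can help to filter them out before drawing. The following is a rough sketch, not part of pytesseract itself: word_boxes and min_conf are hypothetical names, the sample dictionary is invented, and the float() conversion assumes that the type of conf may vary between pytesseract versions.

```python
def word_boxes(data, min_conf=0.0):
    """Keep only word-level entries from an image_to_data-style dict.

    Returns (text, top_left, bottom_right) tuples for entries whose
    confidence is non-negative and at least min_conf.
    """
    words = []
    for text, conf, left, top, width, height in zip(
        data["text"], data["conf"], data["left"],
        data["top"], data["width"], data["height"]
    ):
        conf = float(conf)  # assumption: conf may arrive as str, int or float
        # Skip block/line entries (conf == -1), weak guesses and empty text.
        if conf < 0 or conf < min_conf or not text.strip():
            continue
        words.append((text, (left, top), (left + width, top + height)))
    return words


# Hypothetical sample data standing in for real image_to_data output.
sample = {
    "text": ["", "AB", "12", " "],
    "conf": ["-1", 91, 42.5, 95],
    "left": [0, 10, 60, 0],
    "top": [0, 10, 10, 0],
    "width": [200, 40, 30, 0],
    "height": [50, 30, 30, 0],
}
words = word_boxes(sample, min_conf=60)
```

For number plates, a threshold such as min_conf=60 is only a starting guess and would need tuning on real images.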
Upvotes: 5