I want to split the characters of an image using OpenCV in order to train a Tesseract model.
I am using OpenCV 3.1.0 (because of a MacPorts upgrade - meh..), and the Python documentation for it is still not very clear.
Here is what I do:
For each contour:
The new version of OpenCV also has somewhat different syntax, which makes this even trickier at times.
Here is my code:
def characterSplit(img):
    """
    Splits the characters in an image using contours, ready to be labelled and saved for training with Tesseract
    """
    # Apply Thresholding to binarize Image
    img = cv2.GaussianBlur(img, (3,3), 0)
    img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 75, 10)
    img = cv2.bitwise_not(img)
    # Find Contours
    contours = cv2.findContours(img, cv2.RETR_EXTERNAL , cv2.CHAIN_APPROX_TC89_KCOS, offset=(0,0))[1]
    # Iterate through the contours
    for c in xrange(len(contours)):
        mask = numpy.zeros(img.size)
        cv2.drawContours(mask, contours, c, (0, 255, 0), cv2.FILLED) # Mask is zeros - It might fail here!
        # Where the result will be stored
        res = numpy.zeros(mask.size)
        # Make a Boolean-type numpy array for the Mask
        amsk = mask != 0
        # I use this to copy a part of the image using the generated mask.
        # The result is zeros because the mask is also zeros
        numpy.copyto(res, img.flatten(), where = amsk)
        ## (... Reshape, crop and save the result ...)
As far as I know, the mask should be of the same size as the original image. But should it also have the same shape? For instance, my image is 640x74 but the way I create my mask matrix, my mask is 1x47360. Maybe this is why it fails... (but doesn't throw any errors)
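To make the size-versus-shape point concrete, here is a minimal NumPy-only sketch. The array `img` is a hypothetical stand-in for a 74-row by 640-column grayscale image, not the real input. `img.size` is a single integer (the total pixel count), so `numpy.zeros(img.size)` yields a flat 1-D array, while `numpy.zeros(img.shape)` yields a 2-D array laid out like the image. A flat mask is probably why the contour coordinates never land where expected, though that is an assumption rather than something verified against OpenCV 3.1.0.

```python
import numpy

# Hypothetical stand-in for a 74-row by 640-column grayscale image
img = numpy.zeros((74, 640), dtype=numpy.uint8)

# img.size is the integer 47360, so this creates a flat 1-D array
flat_mask = numpy.zeros(img.size)

# img.shape is the tuple (74, 640), so this creates a 2-D array matching the image
same_shape_mask = numpy.zeros(img.shape, dtype=numpy.uint8)

print(flat_mask.shape)        # (47360,)
print(same_shape_mask.shape)  # (74, 640)
```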
Any help is appreciated!
Upvotes: 0
Views: 1062
I ended up doing what Miki proposed in the comments: I used cv::connectedComponents to do the character splitting. Here is the corresponding code, for anyone who is interested:
import os

import cv2
import numpy

def characterSplit(img, outputFolder=''):
    # Splits the image (OpenCV object, single-channel 8-bit) into distinct characters
    # and exports them as images within the specified folder.
    # Blurring the image with a Gaussian before thresholding is important
    img = cv2.GaussianBlur(img, (3,3), 0)
    img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 75, 10)
    img = cv2.bitwise_not(img)
    # Pass connectivity and ltype by keyword; the supported label types are CV_32S and CV_16U
    output = cv2.connectedComponentsWithStats(img, connectivity=8, ltype=cv2.CV_32S)
    n_labels = output[0]
    labels = output[1]
    stats = output[2]
    # Label 0 is the background, so start from 1
    for c in range(1, n_labels):
        # Mask is a boolean-type numpy array with True in the corresponding region
        mask = labels == c
        res = numpy.zeros_like(img)
        numpy.copyto(res, img, where=mask)
        # The rectangle that bounds the region is stored in:
        # stats[c][0:4] -> [x, y, w, h]
        cv2.imwrite(os.path.join(outputFolder, "region_{}.jpg".format(c)), res)
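The code above writes a full-size image per region. If you want each character cropped to its bounding box instead, the `[x, y, w, h]` values in `stats` can be used for slicing. Below is a rough sketch of that idea. The helper name `crop_component` is my own (hypothetical), and the tiny `img`, `labels` and `stats` arrays are hand-made stand-ins for what `cv2.connectedComponentsWithStats` returns, so the example runs with NumPy alone.

```python
import numpy

def crop_component(img, labels, stats, label):
    """Hypothetical helper: crop one labelled component to its bounding box,
    zeroing any pixels inside the box that belong to other labels."""
    x, y, w, h = (int(v) for v in stats[label][0:4])
    region = img[y:y + h, x:x + w]
    mask = labels[y:y + h, x:x + w] == label
    return numpy.where(mask, region, 0).astype(img.dtype)

# Hand-made stand-ins for a binarized image and its connected-components output
img = numpy.array([[0,   0,   0, 0],
                   [0, 255, 255, 0],
                   [0, 255,   0, 0]], dtype=numpy.uint8)
labels = (img > 0).astype(numpy.int32)   # a single component with label 1
stats = numpy.array([[0, 0, 4, 3, 9],    # label 0 (background): x, y, w, h, area
                     [1, 1, 2, 2, 3]],   # label 1: x, y, w, h, area
                    dtype=numpy.int32)

crop = crop_component(img, labels, stats, 1)
print(crop.shape)  # (2, 2)
```

With real data you would pass the `labels` and `stats` returned by OpenCV and then write `crop` with `cv2.imwrite` instead of `res`.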
Hope this helps!
Upvotes: 1