Reputation: 281
I have the following function to pre-process an image for Tesseract OCR. Most of the text in the image is white, but there can also be green, red, and purple text. I want to be able to read all of it, but when I apply thresholding during pre-processing, the red text disappears. Is there a way to avoid this? It doesn't happen with green text unless it is dark green.
import cv2
import numpy

def pre_process_img(img):
    # img is a PIL image (RGB channel order)
    open_cv_image = numpy.array(img)
    # Convert RGB to BGR
    open_cv_image = open_cv_image[:, :, ::-1].copy()
    # Convert the BGR array to grayscale
    img_gray = cv2.cvtColor(open_cv_image, cv2.COLOR_BGR2GRAY)
    # Upscale 3x to help Tesseract with small text
    img_gray = cv2.resize(img_gray, None, fx=3, fy=3, interpolation=cv2.INTER_CUBIC)
    # Invert so bright text becomes dark
    img_inverted = 255 - img_gray
    ret, thresh1 = cv2.threshold(img_inverted, 127, 255, cv2.THRESH_BINARY)
    # [DEBUG] show pre processed image
    # cv2.imshow("inverted", thresh1)
    # cv2.waitKey(0)
    return thresh1
In this function, img is a PIL.Image.Image. I convert it to an OpenCV image and apply pre-processing (converting to grayscale, resizing, inverting, and binary thresholding). With psm 11 on Tesseract, this has given a good enough result.
By the way, if you have any suggestions for improving my pre_process_img function, I'm open to them. I'm new to OpenCV and just stuck with whatever gave me the best result out of everything I tried.
This is my image here
Upvotes: 2
Views: 190
Reputation: 53081
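Convert from BGR to HSV colorspace in Python/OpenCV, then simply threshold the value (V) channel. The value channel is the maximum of the R, G, and B components at each pixel, so bright red text ends up as bright as white text, whereas grayscale conversion gives red a low weight. In the value channel of your image, all of the text is white (in this case).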
Upvotes: 0