Reputation: 646
I have an image in which horizontal lines run under the text. I have tried several techniques in turn, including HoughLinesP and HoughLines, and this code:
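import cv2
import numpy as np

image = cv2.imread('D:\\detect_words.jpg')
# Invert so that ink is bright and the background is dark
gray = 255 - cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
for row in range(gray.shape[0]):
    # Fraction of pixels in this row that count as ink
    avg = np.average(gray[row, :] > 16)
    if avg > 0.25:
        # Mark the row in red on the colour image and erase it from gray
        cv2.line(image, (0, row), (gray.shape[1]-1, row), (0, 0, 255))
        cv2.line(gray, (0, row), (gray.shape[1]-1, row), (0, 0, 0), 1)
cv2.imwrite('D:\\words\\final_removed.jpg',image)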
After this step, I apply erosion followed by dilation:
kernel = np.ones((3,3), np.uint8)
img_erosion = cv2.erode(255-gray, kernel, iterations=1)
img_dilation = cv2.dilate(img_erosion, kernel, iterations=1)
cv2.imwrite('D:\\words\\final_removed4.jpg',255-img_dilation)
My problem is that removing the horizontal lines this way also removes pixels from the words, and not all of the horizontal lines are removed (here, the horizontal line above AGE is still present). Is there a better approach that minimizes this loss and removes all of the horizontal lines?
Upvotes: 3
Views: 1698
Reputation: 46600
Here's an approach:
After converting to grayscale, we apply Otsu's threshold to obtain a binary image:
image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
Now we create a wide, one-pixel-high rectangular kernel to detect horizontal lines, then apply a morphological opening to obtain a mask of the detected lines:
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (45,1))
detected_lines = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=2)
Here are the detected lines drawn on the original image
From here we find contours on this mask and draw them onto the image in white, which effectively removes the horizontal lines and gives our result:
cnts = cv2.findContours(detected_lines, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    cv2.drawContours(image, [c], -1, (255,255,255), 3)
Now that the horizontal lines are removed, you can try to repair the text with cv2.MORPH_CLOSE and a cv2.MORPH_CROSS kernel, experimenting with various kernel sizes. There is a tradeoff: a kernel large enough to close the holes also thickens the strokes, so detail in the text is lost. Another approach is to use image inpainting to fill in the holes. I'll leave this step to you.
Full code
import cv2
image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (45,1))
detected_lines = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=2)
cnts = cv2.findContours(detected_lines, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    cv2.drawContours(image, [c], -1, (255,255,255), 3)
cv2.imshow('thresh', thresh)
cv2.imshow('detected_lines', detected_lines)
cv2.imshow('image', image)
cv2.waitKey()
Upvotes: 3