IamKarim1992
IamKarim1992

Reputation: 646

Removing slant horizontal lines in image

original image

I have an image where I have a horizontal line underlying the text ; after applying through various techniques in order a. HoughLineP and HoughLine and this code

 image = cv2.imread('D:\\detect_words.jpg')
 gray = 255 - cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
 for row in range(gray.shape[0]):
    avg = np.average(gray[row, :] > 16)
    if avg > 0.25:
        cv2.line(image, (0, row), (gray.shape[1]-1, row), (0, 0, 255))
        cv2.line(gray, (0, row), (gray.shape[1]-1, row), (0, 0, 0), 1)
  cv2.imwrite('D:\\words\\final_removed.jpg',image)

I am able to get to this after processing

after this phase; I am applying erosion and dilation

kernel = np.ones((3,3), np.uint8) 
img_erosion = cv2.erode(255-gray, kernel, iterations=1) 
img_dilation = cv2.dilate(img_erosion, kernel, iterations=1) 
cv2.imwrite('D:\\words\\final_removed4.jpg',255-img_dilation)

final image after dilation and erosion

My question is; removing the horizontal lines although removes but there is pixel loss for words; and not all the horizontal lines are removed. Is there a novel approch where this loss can be minimized and all horizontal lines are removed (here the horizontal lines above AGE is still present).

Upvotes: 3

Views: 1698

Answers (1)

nathancy
nathancy

Reputation: 46600

Here's an approach:

  • Convert image to grayscale
  • Otsu's threshold to get binary image
  • Create horizontal kernel and morph open to detect lines
  • Find contours and draw in detected lines

After converting to grayscale, we Otsu's threshold to obtain a binary image

enter image description here

image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

Now we create a special horizontal kernel to detect horizontal lines then morph open to obtain a mask of the detected lines

enter image description here

horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (45,1))
detected_lines = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=2)

Here's the detected lines drawn on the original image

enter image description here

From here we find contours on this mask and draw them in to effectively remove the horizontal lines to get our result

enter image description here

cnts = cv2.findContours(detected_lines, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

for c in cnts:
    cv2.drawContours(image, [c], -1, (255,255,255), 3)

Now that the horizontal lines are removed, to repair the text, you can try cv2.MORPH_CLOSE with a cv2.MORPH_CROSS kernel and experiment with various kernel sizes. There is a tradeoff between dilating too much to close the holes as the detail in the text will be lost. Another approach is to use image inpainting to fill in the holes. I'll leave this step to you

Full code

import cv2

image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (45,1))
detected_lines = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=2)

cnts = cv2.findContours(detected_lines, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

for c in cnts:
    cv2.drawContours(image, [c], -1, (255,255,255), 3)

cv2.imshow('thresh', thresh)
cv2.imshow('detected_lines', detected_lines)
cv2.imshow('image', image)
cv2.waitKey()

Upvotes: 3

Related Questions