user16478461

Need help extracting lines from this image using OpenCV

Input:

Input

The output should be:

Output

How can I do this using OpenCV or any other method? I tried this:

import cv2

img = cv2.imread('test2.JPG')
imgray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

ret, thresh = cv2.threshold(imgray, 150, 255, 0)
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
print("Number of contours = " + str(len(contours)))
print(contours[0])

# cv2.drawContours(img, contours, -1, (0, 255, 0), 1)
# cv2.drawContours(imgray, contours, -1, (0, 255, 0), 3)

for cnt in contours:
    area = cv2.contourArea(cnt)
    
    if area>20:
        peri = cv2.arcLength(cnt,True)
        approx = cv2.approxPolyDP(cnt, 0.02 * peri, True)
        x,y,w,h = cv2.boundingRect(approx)
        cv2.rectangle(img,(x,y-3),(x+w,y+h-3),(255,0,0),1)

To get this output

Upvotes: 0

Views: 133

Answers (1)

Jeru Luke

Reputation: 21203

You have found the contours and drawn only those above a certain area. So far so good.

To capture each line as an individual entity, you need to find a way to connect the text in each line. Since the lines in the image are straight, a simple approach is to take a horizontal kernel ([1, 1, 1, 1, 1]) of a certain length and apply a morphological operation (dilation, in this case).

Code:

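import cv2

img = cv2.imread('text.jpg', 1)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Otsu picks the threshold; THRESH_BINARY_INV turns dark text into white foreground
th = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]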


Use a horizontal kernel 8 pixels long. This length is the parameter you would need to change when trying this on other images with a different font size or text length.

hor_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (8, 1))
# array([[1, 1, 1, 1, 1, 1, 1, 1]], dtype=uint8)

dilated = cv2.dilate(th, hor_kernel, iterations=1)
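To get a feel for why a one-row kernel joins text sideways, here is a toy NumPy-only sketch on a single row of pixels. It is not OpenCV's implementation; `dilate_row` and `count_runs` are hypothetical helpers written just for this illustration, and the window is simply centred on each pixel.

```python
import numpy as np

def dilate_row(row, length):
    """Toy horizontal dilation: a pixel turns on if any pixel in its window is on."""
    row = np.asarray(row, dtype=np.uint8)
    left = length // 2           # how far the window reaches to the left
    right = length - 1 - left    # how far it reaches to the right
    out = np.zeros_like(row)
    for i in range(len(row)):
        lo = max(0, i - left)
        hi = min(len(row), i + right + 1)
        out[i] = row[lo:hi].max()
    return out

def count_runs(row):
    """Count separate runs of foreground (non-zero) pixels in a row."""
    bits = (np.asarray(row) > 0).astype(int)
    rising = np.diff(np.concatenate(([0], bits))) == 1
    return int(np.count_nonzero(rising))

# Two "characters" close together and a third one far away
row = [0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
dilated_row = dilate_row(row, 3)
print(count_runs(row), "runs before,", count_runs(dilated_row), "runs after")
```

Pixels that sit closer together than the kernel length fuse into one run, while distant ones stay apart. That is why the kernel length has to be matched to the spacing between characters and words in your image.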

Dilating with a horizontal kernel stretches every white pixel sideways, so neighbouring characters on the same line merge into connected blobs. From here on, we find the outermost contours above a certain area.

contours, hierarchy = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)

img2 = img.copy()
for c in contours:
    # skip small blobs such as noise or stray marks
    area = cv2.contourArea(c)
    if area > 100:
        x, y, w, h = cv2.boundingRect(c)
        img2 = cv2.rectangle(img2, (x, y), (x + w, y + h), (0, 255, 0), 1)
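If you want each line as its own image rather than just an outline, one possible follow-up is to collect the bounding boxes, sort them top to bottom, and slice the image. This is only a sketch: `crop_lines` is a hypothetical helper, the blank `page` array stands in for your real image, and the boxes are assumed to be `(x, y, w, h)` tuples like the ones `cv2.boundingRect` returns.

```python
import numpy as np

def crop_lines(image, boxes):
    """Return one crop per (x, y, w, h) box, ordered top to bottom."""
    ordered = sorted(boxes, key=lambda b: (b[1], b[0]))  # by y, then by x
    return [image[y:y + h, x:x + w] for (x, y, w, h) in ordered]

# Stand-in data: a blank grayscale "page" and two boxes in arbitrary order
page = np.zeros((60, 100), dtype=np.uint8)
boxes = [(5, 40, 80, 10), (5, 10, 90, 10)]

crops = crop_lines(page, boxes)
for n, crop in enumerate(crops):
    print("line", n, "has shape", crop.shape)
```

In the loop above you would append `(x, y, w, h)` to a list for every contour that passes the area check, then pass that list in together with `img`.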


Upvotes: 2
