Input:
The output should be:
How can I do this using OpenCV or any other method? I tried this:
import cv2

img = cv2.imread('test2.JPG')
imgray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(imgray, 150, 255, cv2.THRESH_BINARY)
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
print("Number of contours = " + str(len(contours)))
print(contours[0])
# cv2.drawContours(img, contours, -1, (0, 255, 0), 1)
# cv2.drawContours(imgray, contours, -1, (0, 255, 0), 3)
for cnt in contours:
    area = cv2.contourArea(cnt)
    if area > 20:
        peri = cv2.arcLength(cnt, True)
        approx = cv2.approxPolyDP(cnt, 0.02 * peri, True)
        x, y, w, h = cv2.boundingRect(approx)
        cv2.rectangle(img, (x, y - 3), (x + w, y + h - 3), (255, 0, 0), 1)
Upvotes: 0
Views: 133
Reputation: 21203
You have found the contours and drawn only those above a certain area. So far so good.
To capture each line as an individual entity, you need a way to connect the text in each line. Since the lines in the image are straight, a simple approach is to take a horizontal kernel ([1, 1, 1, 1, 1]) of a certain length and apply a morphological operation (dilation) with it.
Code:
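import cv2

img = cv2.imread('text.jpg', 1)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Otsu picks the threshold automatically; the inverted binary mode makes
# the text white on black (assuming dark text on a light background)
th = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]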
Use a horizontal kernel 8 pixels long. This is the parameter to change when trying other images with a different font size or text length.
hor_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (8, 1))
# array([[1, 1, 1, 1, 1, 1, 1, 1]], dtype=uint8)
dilated = cv2.dilate(th, hor_kernel, iterations=1)
The dilated image above shows what a horizontal kernel does: the characters on each line smear sideways and merge into a single blob. From here on, we find the outermost contours above a certain area.
contours, hierarchy = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
img2 = img.copy()
for c in contours:
    area = cv2.contourArea(c)
    if area > 100:
        x, y, w, h = cv2.boundingRect(c)
        img2 = cv2.rectangle(img2, (x, y), (x + w, y + h), (0, 255, 0), 1)
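To get a feel for how the kernel length interacts with the spacing in your text, here is a minimal pure-Python sketch of horizontal dilation on a single row of pixels. It is a simplified model, not OpenCV's implementation: `dilate_row` and `count_runs` are hypothetical helper names, the row is made-up data, and the kernel is assumed to be anchored at `width // 2`.

```python
def dilate_row(row, width):
    """Dilate a 1-D binary row with a flat kernel of the given width.

    An output pixel is 1 if any input pixel under the kernel is 1.
    Assumption: the kernel is anchored at width // 2.
    """
    anchor = width // 2
    n = len(row)
    out = []
    for i in range(n):
        lo = max(0, i - anchor)
        hi = min(n, i - anchor + width)
        out.append(1 if any(row[lo:hi]) else 0)
    return out


def count_runs(row):
    """Count maximal runs of consecutive 1s (1-D connected components)."""
    runs = 0
    prev = 0
    for v in row:
        if v and not prev:
            runs += 1
        prev = v
    return runs


# Three 2-pixel "characters": a 3-pixel gap, then a 12-pixel gap.
row = [0, 0, 1, 1, 0, 0, 0, 1, 1] + [0] * 12 + [1, 1, 0, 0]

print(count_runs(row))                 # 3 separate blobs before dilation
print(count_runs(dilate_row(row, 8)))  # 2: the 3-pixel gap is bridged
```

In this model a kernel of width k closes gaps of up to k - 1 pixels, so a width of 8 joins the two close characters but leaves the 12-pixel gap open. On a real image that suggests picking a kernel somewhat wider than the largest gap between words on a line. Because the kernel is only 1 pixel tall, it does not merge neighbouring lines vertically.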
Upvotes: 2