Remove the black line surrounding text in OpenCV

Question

I am trying to remove the black lines surrounding the text, if any are present. My goal is to keep just enough of the image to extract each character in it. The extra black lines act as noise when I try to extract the characters.

I have tried using flood fill in OpenCV, but the image contains some white pixels before the black line starts in the upper-left corner, so that has not been fruitful. I also tried cropping by finding contours, but that does not work either. The image is as follows:

Original Image

import cv2
import numpy as np

img = cv2.imread('./Cropped/22.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
_,thresh = cv2.threshold(gray,1,255,cv2.THRESH_BINARY)
contours,hierarchy = cv2.findContours(thresh,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
cnt = contours[0]
x,y,w,h = cv2.boundingRect(cnt)
crop = img[y:y+h,x:x+w]

cv2.imshow('Image',img)
cv2.imshow('Cropped Image',crop)
cv2.waitKey(0)

And using flood fill:

img = cv2.imread('./Cropped/22.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# negate the gray image, then binarize it with an adaptive threshold
gray = cv2.bitwise_not(gray)
bw = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                           cv2.THRESH_BINARY, 15, -2)

# find external contours of all shapes
contours,h = cv2.findContours(bw, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)

# create a mask for floodfill function, see documentation
h,w,_ = img.shape
mask = np.zeros((h+2,w+2), np.uint8)

# determine which contour belongs to a square or rectangle
for cnt in contours:
    poly = cv2.approxPolyDP(cnt, 0.02*cv2.arcLength(cnt,True),True)
    if len(poly) == 4:
        # if the contour has 4 vertices then floodfill that contour with black color
        cnt = np.vstack(cnt).squeeze()
        seed = (int(cnt[0][0]), int(cnt[0][1]))  # floodFill expects plain Python ints
        cv2.floodFill(bw, mask, seed, 0)
# floodFill edits bw in place; invert it back to the original polarity
binary = cv2.bitwise_not(bw)

cv2.imshow('Image', binary)
cv2.waitKey(0)
cv2.destroyAllWindows()

The results in the two cases are as follows:

Cropped Image

There appears to be no change after cropping, and

Using floodfill

the flood fill does not remove any borders. The ideas for both snippets came from Stack Overflow answers to similar questions.

EDIT

I followed the approach suggested in the comment by @rayryeng. However, when I feed the cropped image into number extraction, I get these images and a wrong result. I guess some noisy pixels are not being removed. This is the original image: Original Image. The thresholded image is: Thresholding Image. The extracted contours are as follows: First contour, Second contour, Third contour, Fourth contour. A generalised solution to this would be great.
