Shashank

Reputation: 1135

OpenCV: How to remove text from background

Right now I am trying to write a program that removes text from a background, but I am running into a lot of problems.

My approach is to use pytesseract to get the text boxes and then use cv2.inpaint to paint over each box and remove the text. In short:

import cv2
import pytesseract
from pytesseract import Output

d = pytesseract.image_to_data(img, output_type=Output.DICT)  # Run OCR on the image
n_boxes = len(d['level'])  # Number of detected boxes
for i in range(n_boxes):  # Loop through the boxes
    # Get coordinates
    (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])
    crop_img = img[y:y+h, x:x+w]  # Crop the box from the image
    gray = cv2.cvtColor(crop_img, cv2.COLOR_BGR2GRAY)
    gray = cv2.bitwise_not(gray)  # Invert it
    thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU)[1]
    dst = cv2.inpaint(crop_img, thresh, 10, cv2.INPAINT_TELEA)  # Then inpaint
    img[y:y+h, x:x+w] = dst  # Place the inpainted crop back into the source image

Now the problem is that I am not able to remove the text completely. Image: enter image description here

I am not sure what other method I can use to remove text from an image. I am new to this, which is why I am struggling. Any help is much appreciated.

Note: The image looks stretched because I resized it to fit the screen.

Original Image:

enter image description here

Upvotes: 5

Views: 14329

Answers (1)

nathancy

Reputation: 46670

Here's an approach using morphological operations + contour filtering:

  • Convert image to grayscale
  • Otsu's threshold to obtain a binary image
  • Perform morph close to connect words into a single contour
  • Dilate to ensure that all bits of text are contained in the contour
  • Find contours and filter using contour area
  • Remove text by "filling" in the contour rectangle with the background color

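I used Chrome developer tools to determine the background color of the image, which was (222,228,251). Note that OpenCV drawing functions expect colors in BGR order, so a value read as RGB would need its channels reversed. If you want to determine the background color dynamically, you could try finding the dominant color using k-means. Here's the result: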

enter image description here

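import cv2

# Load the image, convert to grayscale, and apply an inverted Otsu
# threshold so that dark text becomes white foreground
image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Morph close with a wide kernel to connect words into a single contour
close_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (15,3))
close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, close_kernel, iterations=1)

# Dilate so that all bits of text are contained in the contour
dilate_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,3))
dilate = cv2.dilate(close, dilate_kernel, iterations=1)

# Find contours; the return value differs between OpenCV versions
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    # Keep only contours whose area is plausible for a block of text
    area = cv2.contourArea(c)
    if 800 < area < 15000:
        # Fill the bounding rectangle with the background color
        x, y, w, h = cv2.boundingRect(c)
        cv2.rectangle(image, (x, y), (x + w, y + h), (222,228,251), -1)

cv2.imshow('image', image)
cv2.waitKey()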

Upvotes: 7
