Dhaval Bhavsar

Reputation: 495

How to remove other noise from an image using OpenCV

I am using the script below to remove the black spots in the image and the line struck through the number. It removes some of the noise, but not all of it.

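import cv2
import numpy as np
import pytesseract
from PIL import Image

# src_path (the output directory) is assumed to be defined elsewhere.
def get_string(img_path):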
    # Read image with opencv
    img = cv2.imread(img_path)

    # Convert to gray
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Apply dilation and erosion to remove some noise
    kernel = np.ones((1, 1), np.uint8)
    img = cv2.dilate(img, kernel, iterations=12)
    img = cv2.erode(img, kernel, iterations=12)

    # Write the image after noise removal
    cv2.imwrite(src_path + "removed_noise.png", img)

    # Apply threshold to get image with only black and white
    img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 2)

    # Write the image after apply opencv to do some ...
    cv2.imwrite(src_path + "thres.png", img)

    # Recognize text with tesseract for python
    result = pytesseract.image_to_string(Image.open(src_path + "vertical_final.jpg"))

    # Remove template file
    #os.remove(temp)

    return result

However, it is not working properly.

Input image:

img

Output image:

img

I would highly appreciate any help with these problems. Source code:

def get_string(img_path):
    # Read image with opencv
    img = cv2.imread(img_path)

    # Convert to gray
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Apply dilation and erosion to remove some noise
    kernel = np.ones((1,20), np.uint8)
    img = cv2.dilate(img, kernel, iterations=1)
    img = cv2.erode(img, kernel, iterations=1)

    #img = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)

    kernel = np.ones((1, 1), np.uint8)
    #img = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)

    cv2.imwrite(src_path + "removed_noise.png", img)

    img3 = cv2.subtract(cv2.imread(src_path + "removed_noise.png"),cv2.imread(src_path + "tax_amount.png"))

    cv2.imwrite(src_path + "removed_noise_makes_00.png", img3)

    lower_black = np.array([0,0,0], dtype = "uint16")
    upper_black = np.array([70,70,70], dtype = "uint16")
    black_mask = cv2.inRange(img3, lower_black, upper_black)
    black_mask[np.where((black_mask == [0] ).all(axis = 1))] = [255]

    opening = cv2.morphologyEx(black_mask, cv2.MORPH_CLOSE, kernel)

    cv2.imwrite(src_path + "removed_noise_makes_00_1.png", opening)

    # Recognize text with tesseract for python
    result = pytesseract.image_to_string(Image.open(src_path + "removed_noise_makes_00_1.png"))

    # Remove template file
    #os.remove(temp)

    return result

Upvotes: 0

Views: 3479

Answers (1)

Cris Luengo

Reputation: 60780

Consider this part of your code:

kernel = np.ones((1, 1), np.uint8)
img = cv2.dilate(img, kernel, iterations=12)

Here you apply a dilation 12 times with a 1x1 structuring element (SE). Unless OpenCV does something special with such an SE, this code should not change your image at all.

You should create a larger SE:

kernel = np.ones((7, 7), np.uint8)
img = cv2.dilate(img, kernel, iterations=1)
img = cv2.erode(img, kernel, iterations=1)

This first dilates the image and then erodes the result. The effect is that small (thin) black regions, those in which the SE does not fit, disappear. It is equivalent to a closing:

img = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)

To remove the long line, you want to apply a closing with an elongated SE:

kernel = np.ones((1, 30), np.uint8)
line = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)

Because the text strokes are narrower than this SE, the closing removes them and leaves only the horizontal line. The difference between img and line is then the text without the line.

If you think of img as the sum of line and text, then img - line is text. However, there is still a small problem: img has a white background (255) and a black foreground. So really img = 255 - text - line, and the line image found above is really 255 - line, because it also has a white background. Taking the difference directly will therefore not produce the desired effect.

The solution is to invert your images first:

img = 255 - img
line = 255 - line
text = img - line

Upvotes: 4
