Akashdeep Saluja

Reputation: 3089

Removing text while processing the image

I am working on an application that needs a CamScanner-like feature: detecting a document in an image. To do this, I am using Canny edge detection followed by a Hough transform.

The results look promising, but the text in the document causes problems, as the images below show:

[Image: original image]

[Image: after Canny edge detection]

[Image: after Hough transform]

My issue is with the third image: the text near the bottom of the original image has caused the Hough transform to detect a horizontal line (the second cluster from the bottom).

I know I can take the largest quadrilateral, and that would work fine in most cases, but I would still like to know of other ways to keep the text from affecting the detected edges during this processing.

Any help would be appreciated.

Upvotes: 5

Views: 2659

Answers (2)

Akashdeep Saluja

Reputation: 3089

I solved the text problem by applying a 15x15 (square) median filter to a 500x700 image.

A median filter leaves the boundaries of the paper essentially intact, but it can eliminate the text completely.

With that, I was able to get much cleaner boundaries.
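To illustrate the idea, here is a minimal sketch of a square median filter in plain Python, run on a small synthetic image rather than the real photo. The function name `median_filter`, the clamped border handling, and the 5x5 window on a 12x20 test image are my own assumptions for the demo. In practice you would use a library routine (for example `cv2.medianBlur` in OpenCV) with a window scaled to your image, such as the 15x15 mentioned above.

```python
def median_filter(image, size):
    """Apply a size x size median filter to a 2D list of grey levels.

    Borders are handled by clamping coordinates to the image
    (edge replication), which is an assumption of this sketch.
    """
    if size % 2 == 0:
        raise ValueError("size must be odd")
    height, width = len(image), len(image[0])
    radius = size // 2
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = []
            for dy in range(-radius, radius + 1):
                yy = min(max(y + dy, 0), height - 1)
                for dx in range(-radius, radius + 1):
                    xx = min(max(x + dx, 0), width - 1)
                    window.append(image[yy][xx])
            window.sort()
            # Middle element of the sorted window is the median.
            result[y][x] = window[len(window) // 2]
    return result


# Synthetic "photo": dark background (0) in columns 0-5, white paper (255)
# in columns 6-19, plus a one-pixel-high line of dark "text" on the paper.
page = [[0] * 6 + [255] * 14 for _ in range(12)]
for x in range(9, 17):
    page[6][x] = 0

smoothed = median_filter(page, 5)

# The thin text line is gone, while the background/paper edge stays
# between columns 5 and 6.
print(smoothed[6])
```

The text disappears because it never fills more than half of any window, whereas the straight paper edge does. Note that sharp corners of the paper may get slightly rounded by a large window.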

Upvotes: 4

brad

Reputation: 954

Another approach you could try is to use thresholding to find the paper boundaries. This creates a binary image. You can then examine the blobs of white pixels and check whether any is large enough to be the paper and has the right dimensions. If a blob fits the criteria, you can use its min/max points to represent the paper.

There are several ways to do the thresholding, including iterative selection, Otsu's method, and adaptive thresholding.

Also, for best results you may have to dilate the binary image to close the black lines in the table shown in your example.
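As one example of these options, here is a small sketch of Otsu's method in plain Python. It picks the grey level that maximises the between-class variance of the histogram. The function name `otsu_threshold` and the toy pixel values are made up for illustration. With a real image you would normally call a library implementation instead.

```python
def otsu_threshold(pixels):
    """Return a threshold t for 8-bit grey levels using Otsu's method.

    Pixels with value > t are treated as foreground (white).
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # everything is background from here on
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor).
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


# Toy data: 60 dark background pixels and 40 bright "paper" pixels.
pixels = [30] * 40 + [40] * 20 + [200] * 30 + [220] * 10
t = otsu_threshold(pixels)
binary = [1 if p > t else 0 for p in pixels]
print(t, sum(binary))
```

Otsu's method works best when the histogram has two clear peaks (paper versus background). Under uneven lighting, an adaptive threshold may be the better choice.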
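The dilation and blob steps could look roughly like this in plain Python. This is only a sketch under my own assumptions: the helper names `dilate` and `largest_blob_bbox`, the 3x3 structuring element, 4-connectivity, and the tiny synthetic mask are all illustrative choices, not anything from the question.

```python
from collections import deque


def dilate(mask, radius=1):
    """Binary dilation with a (2*radius+1) x (2*radius+1) square element."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                    for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[yy][xx] = 1
    return out


def largest_blob_bbox(mask):
    """Return (area, (min_x, min_y, max_x, max_y)) of the largest
    4-connected white blob, or None if the mask has no white pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = None
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            # Breadth-first flood fill from this unvisited white pixel.
            queue = deque([(sx, sy)])
            seen[sy][sx] = True
            area = 0
            min_x = max_x = sx
            min_y = max_y = sy
            while queue:
                x, y = queue.popleft()
                area += 1
                min_x, max_x = min(min_x, x), max(max_x, x)
                min_y, max_y = min(min_y, y), max(max_y, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if best is None or area > best[0]:
                best = (area, (min_x, min_y, max_x, max_y))
    return best


# Synthetic 12x14 mask: white paper in rows 3-8, columns 4-10, split in two
# by a black table line in row 6, plus one white noise pixel at (0, 0).
mask = [[0] * 14 for _ in range(12)]
for y in range(3, 9):
    for x in range(4, 11):
        mask[y][x] = 1
for x in range(4, 11):
    mask[6][x] = 0
mask[0][0] = 1

print(largest_blob_bbox(mask))          # only the upper half of the paper
print(largest_blob_bbox(dilate(mask)))  # the whole paper as a single blob
```

Dilation also grows the blob by `radius` pixels on each side, so you may want to erode afterwards (a morphological closing) or shrink the resulting box accordingly. A size or aspect-ratio check on the returned area and box would implement the "right dimensions" test.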

Upvotes: 1
