Dave

Reputation: 883

Skewing text - How to take advantage of existing edges

I have the following JPG image. I want to find the edges where the white page meets the black background, so that I can rotate the contents a few degrees clockwise. My aim is to straighten the text for Tesseract OCR. I don't see the need to rotate the individual text blocks, as I have seen done in similar examples.

In the Canny Edge Detection docs, the third argument (200 in e.g. edges = cv.Canny(img,100,200)) is maxVal, and anything above it is said to be 'sure to be edges'. Is there any way to determine these (min/max) values ahead of time, rather than by trial and error?

I have tried code examples that use the Python cv2 module, but their edge detection is set up for simpler applications.

Is there any approach I can use to take the text out of the equation? For example, only detecting edge lines longer than a specified length?

Any suggestions would be appreciated.

Historical Electoral Roll

Below is an example of edge detection (on the above image, with the same min/max values). The outer edge of the page is clearly defined. The image is high-contrast black and white, with even lighting, so I can't see a need for an adaptive threshold. A simple global threshold is working. It's just a question of what values to use.

I don't have the answer to this yet, but to add: I now have the contours of the above document.

edges

contours

I used the find contours tutorial, with some customization of the file loading. Note: removing the words gives a thinner, cleaner outline.

Upvotes: 1

Views: 58

Answers (1)

J_H

Reputation: 20450

Consider Otsu.

Its chief virtue is that it picks the threshold automatically from the image's own grayscale histogram, so there is no min/max value to guess. In your case, blank margins might be the saving grace.
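As a rough illustration of what Otsu computes, here is a minimal NumPy sketch. The function name otsu_threshold is made up for this example, and it assumes an 8-bit grayscale array. In OpenCV the same thing is available as cv.threshold(img, 0, 255, cv.THRESH_BINARY + cv.THRESH_OTSU). Some people then feed the Otsu value to Canny as maxVal and half of it as minVal; treat that as a heuristic to try, not a rule.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the Otsu threshold t for a uint8 grayscale array.

    Pixels <= t form one class, pixels > t the other. t is chosen to
    maximize the between-class variance of the two classes.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                   # weight of the dark class
    mu = np.cumsum(prob * np.arange(256))     # cumulative mean
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom
    # Thresholds that leave one class empty give 0/0; score them as 0.
    sigma_b = np.nan_to_num(sigma_b, nan=0.0, posinf=0.0, neginf=0.0)
    return int(np.argmax(sigma_b))

# Toy image: dark background with a bright page region.
demo = np.full((10, 10), 20, dtype=np.uint8)
demo[:, 5:] = 220
t = otsu_threshold(demo)
binary = demo > t   # True where the "page" is
```

On a scan like yours, with a large black border and a white page, the histogram should be strongly bimodal, which is the case Otsu handles best.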


Consider working on a series of 2x reduced resolution images, where new pixel is min() (or even max()!) of original four pixels. These reduced images might help you to focus on the features that matter for your use case.
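A minimal sketch of that reduction, assuming a 2-D grayscale NumPy array. The names reduce_2x and pyramid are made up for this example. With dark text on a white page, min() thickens the text and max() tends to erase thin strokes, which may help isolate the page outline.

```python
import numpy as np

def reduce_2x(gray, op=np.min):
    """Halve the resolution; each new pixel is op() of a 2x2 block."""
    h, w = gray.shape
    h2, w2 = (h // 2) * 2, (w // 2) * 2      # drop an odd trailing row/column
    blocks = gray[:h2, :w2].reshape(h2 // 2, 2, w2 // 2, 2)
    return op(blocks, axis=(1, 3))

def pyramid(gray, levels, op=np.min):
    """Return [original, 1/2, 1/4, ...] with `levels` reductions."""
    out = [gray]
    for _ in range(levels):
        out.append(reduce_2x(out[-1], op))
    return out

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])
small_min = reduce_2x(a, np.min)
small_max = reduce_2x(a, np.max)
```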


The usual way to deskew scanned text is to binarize and then keep changing theta until the "sum of pixels across raster" is zero, or small, for the rasters that fall between lines of text. In particular, with few descenders and decent inter-line spacing, we will see "lots" of pixels on each line of text and "near zero" between text lines when theta matches the original printing orientation. That lets us recover (1.) pixels per line and (2.) inter-line spacing, assuming we've found a near-optimal theta.
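Here is a minimal sketch of that search, assuming a binarized array where text pixels are nonzero. The names row_profile and estimate_skew, the angle range, and the 0.1 degree step are all choices made for this example. Instead of rotating the image for every theta, it rotates only the coordinates of the text pixels, and it scores each theta by the sum of squared row totals, which is largest when the pixels pile up on few rasters. The sign you need when actually rotating the image depends on your rotation routine, so check it on one scan.

```python
import numpy as np

def row_profile(ys, xs, angle_deg):
    """Count text pixels per raster after rotating coordinates by angle_deg."""
    t = np.deg2rad(angle_deg)
    rows = np.round(ys * np.cos(t) - xs * np.sin(t)).astype(int)
    rows -= rows.min()                        # shift so indices start at 0
    return np.bincount(rows)

def estimate_skew(binary, angles=None):
    """Return the candidate angle (degrees) with the sharpest row profile."""
    if angles is None:
        angles = np.arange(-5.0, 5.01, 0.1)
    ys, xs = np.nonzero(binary)               # assumes at least one text pixel
    scores = [np.sum(row_profile(ys, xs, a).astype(float) ** 2) for a in angles]
    return float(angles[int(np.argmax(scores))])

# Synthetic page: 6-pixel-thick "text lines" every 20 pixels, skewed by 2 degrees.
t0 = np.deg2rad(2.0)
yy, xx = np.mgrid[0:200, 0:300]
skewed = np.mod(yy * np.cos(t0) - xx * np.sin(t0), 20) < 6
found = estimate_skew(skewed)
```

Once a good theta is found, the profile at that angle shows the peaks (text lines) and gaps (inter-line spacing) described above.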

In your particular case, focusing on the ... leader dots seems a promising approach to finding the globally optimal deskew correction angle. Discarding large rectangles of pixels in the left and right regions of the image could actually reduce noise and enhance the accuracy of such an approach.
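Discarding the left and right regions can be as simple as slicing the binarized array before running the angle search. A small sketch, where keep_center_columns and the 25% / 75% cut points are placeholders you would tune by looking at where the leader dots sit on your pages:

```python
import numpy as np

def keep_center_columns(binary, left_frac=0.25, right_frac=0.75):
    """Return only the columns between the two fractional cut points."""
    w = binary.shape[1]
    return binary[:, int(w * left_frac):int(w * right_frac)]

page = np.ones((100, 400), dtype=np.uint8)
center = keep_center_columns(page)
```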

Upvotes: 0
