Reputation: 713
The code I've produced to detect and correct skew is giving me inconsistent results. I'm currently working on a project that uses OCR text extraction on images (via Python and OpenCV), so removing skew is key if accurate results are desired. My code uses cv2.minAreaRect to detect skew.
The images I'm using are all identical (and will be in the future), so I'm unsure what is causing these inconsistencies. I've included two sets of before and after images (including the skew value from cv2.minAreaRect) where I applied my code: one showing successful removal of skew and one showing that the skew was not removed (it looks like even more skew was added).
Image 1 Before (-87.88721466064453)
Image 1 After (successful deskew)
Image 2 Before (-5.766754150390625)
Image 2 After (unsuccessful deskew)
My code is below. Note: I've worked with many more images than those I've included here. The detected skew thus far has always been in the ranges [-10, 0) or (-90, -80], so I attempted to account for this in my code.
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img_gray = cv2.bitwise_not(img_gray)
thresh = cv2.threshold(img_gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
coords = np.column_stack(np.where(thresh > 0))
angle = cv2.minAreaRect(coords)[-1]
if (angle < 0 and angle >= -10):
    angle = -angle #this was intended to undo skew for values in [-10, 0) by simply rotating using the opposite sign
else:
    angle = (90 + angle)/2
(h, w) = img.shape[:2]
center = (w // 2, h // 2)
M = cv2.getRotationMatrix2D(center, angle, 1.0)
deskewed = cv2.warpAffine(img, M, (w, h), flags = cv2.INTER_CUBIC, borderMode = cv2.BORDER_REPLICATE)
I've looked through various posts and articles to find an adequate solution, but have been unsuccessful. This post was the most helpful in understanding the skew values, but even then I couldn't get very far.
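For reference, the two angle ranges described above ([-10, 0) and (-90, -80]) can be handled by a single rule, because a rectangle's orientation is only defined up to multiples of 90 degrees. The following is a minimal sketch, not a confirmed fix: the helper name fold_rect_angle is hypothetical, and it only folds the reported angle into [-45, 45). It does not decide the sign of the correcting rotation.

```python
def fold_rect_angle(angle_deg: float) -> float:
    """Fold a rectangle angle (in degrees) into the range [-45, 45).

    A rectangle rotated by a multiple of 90 degrees occupies the same
    orientation, so values near -90 and values near 0 can both describe
    a small skew. Folding makes the two cases directly comparable.
    """
    return (angle_deg + 45.0) % 90.0 - 45.0


# The two skew values reported in the question.
for reported in (-87.88721466064453, -5.766754150390625):
    folded = fold_rect_angle(reported)
    print(f"{reported:.3f} -> {folded:.3f}")
```

Whether the correcting rotation passed to cv2.getRotationMatrix2D should be the folded value or its negative is an assumption to verify on an image with known skew: it depends on the coordinate order of the points given to cv2.minAreaRect (np.where returns (row, col) pairs) and on the OpenCV version, since the reported angle range differs between releases.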
Upvotes: 4
Views: 9188
Reputation: 1614
I already answered this here: How to deskew a scanned text page with ImageMagick?
The following code can help you deskew the image:
import numpy as np
from skimage import io
from skimage.transform import rotate
from skimage.color import rgb2gray
from deskew import determine_skew
from matplotlib import pyplot as plt
def deskew(_img):
    image = io.imread(_img)
    grayscale = rgb2gray(image)
    angle = determine_skew(grayscale)
    rotated = rotate(image, angle, resize=True) * 255
    return rotated.astype(np.uint8)

def display_before_after(_original):
    plt.subplot(1, 2, 1)
    plt.imshow(io.imread(_original))
    plt.subplot(1, 2, 2)
    plt.imshow(deskew(_original))

display_before_after('img_35h.jpg')
Reference and Source: http://aishelf.org/deskew/
Upvotes: 2
Reputation: 53081
A very good text deskew tool can be found in Python Wand, which uses ImageMagick. It is based upon the Radon transform.
Form 1:
Form 2:
from wand.image import Image
from wand.display import display
with Image(filename='form1.png') as img:
    img.deskew(0.4*img.quantum_range)
    img.save(filename='form1_deskew.png')
    display(img)

with Image(filename='form2.png') as img:
    img.deskew(0.4*img.quantum_range)
    img.save(filename='form2_deskew.png')
    display(img)
Form 1 deskewed:
Form 2 deskewed:
Upvotes: 10