Reputation: 35
I am preparing an image for Tesseract to OCR. What I have done so far is convert my image to the following:
What I basically want is to cut the image into horizontal portions based on the white regions. Like so:
What I care most about are the text areas on the left side and in the middle.
The problem is that if I pick only the left region, I can't find a way to also pick the ones in the middle without deleting some parts.
The other problem I faced is that if I give Tesseract all the regions (I have already successfully extracted every region that contains text), it gives me rubbish, since the picture contains both a Latin and a non-Latin script.
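(For what it's worth, a minimal sketch of how I could hand both scripts to Tesseract at once, assuming pytesseract is used and that the non-Latin script is Arabic with the ara traineddata installed:)

import cv2
import pytesseract

# sketch only: assumes the non-Latin text is Arabic and that both the
# 'eng' and 'ara' traineddata files are available to Tesseract
region = cv2.imread('region.png')   # one of the cropped text regions
text = pytesseract.image_to_string(region, lang='eng+ara')
print(text)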
Another important thing is that there is no predefined size, so it would be wrong to assume the size in this picture is standard.
To recapitulate: how can I cut the image horizontally based on the white regions?
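(A minimal sketch of the kind of cut I mean, based on counting white pixels per row; the 0.995 white-row ratio and the file names are just assumptions:)

import cv2
import numpy as np

# sketch: split the page into horizontal strips wherever a run of rows
# is (almost) entirely white
img = cv2.imread('lic.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, bw = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

white_ratio = (bw == 255).mean(axis=1)   # fraction of white pixels per row
is_text_row = white_ratio < 0.995        # rows that contain some ink

strips, start = [], None
for y, has_text in enumerate(is_text_row):
    if has_text and start is None:
        start = y
    elif not has_text and start is not None:
        strips.append(img[start:y])
        start = None
if start is not None:
    strips.append(img[start:])

for i, strip in enumerate(strips):
    cv2.imwrite('strip_{}.png'.format(i), strip)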
Upvotes: 2
Views: 153
Reputation: 21233
I looked up the documentation to see if there was anything I could use, and yes, I came across an interesting property called the extent of a contour, from THIS PAGE.
The extent of a contour is defined as the ratio of the area of the contour to the area of the bounding rectangle of that contour. So the closer this value is to 1, the more the contour resembles a rectangle.
For the image you have given, it does not detect the words that look like Arabic. But it would work if some morphological operations were done prior to this.
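I have not shown that morphological step here, but a minimal sketch, assuming a closing with a wide horizontal kernel (the 13x3 kernel size is only a guess you would tune, and imgray is the grayscale image from the code below), could look like this:

# sketch: merge the disconnected Arabic-looking glyphs into word-sized blobs
# before finding contours; the kernel size is only an assumption
th_inv = cv2.threshold(imgray, 0, 255,
                       cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (13, 3))
closed = cv2.morphologyEx(th_inv, cv2.MORPH_CLOSE, kernel)

You could then run the same contour and extent loop on closed instead of th2.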
Code:
import cv2

path = 'C:/Users/Desktop/Stack/contour/'
im = cv2.imread(path + 'lic.png')

#--- resized because the image was too big ---
im = cv2.resize(im, (0, 0), fx = 0.5, fy = 0.5)

#--- grayscale followed by Otsu thresholding ---
imgray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
ret2, th2 = cv2.threshold(imgray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

im2 = im.copy()

#--- OpenCV 3.x returns three values here; in 4.x drop the first one ---
_, contours, hierarchy = cv2.findContours(th2, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)

count = 0
#--- It all begins here ---
for cnt in contours:
    area = cv2.contourArea(cnt)
    x, y, w, h = cv2.boundingRect(cnt)
    rect_area = w * h
    extent = float(area) / rect_area
    #--- there were some very small rectangular regions, hence the extra area criterion ---
    if (extent > 0.5) and (area > 100):
        count += 1
        cv2.drawContours(im2, [cnt], 0, (0, 255, 0), 2)

cv2.imshow(path + 'contoursdate.jpg', im2)
cv2.waitKey(0)

print('Number of possible words : {}'.format(count))
Result:
In this case I have just drawn the contours. You, on the other hand, can crop these regions by fitting a bounding rectangle around each of them and feed them individually to an OCR engine.
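A rough sketch of that last step, continuing from the code above and assuming pytesseract with, say, 'eng+ara' as the language mix:

import pytesseract

# sketch: crop each accepted contour via its bounding rectangle and hand
# the crop to Tesseract; the language string is only an assumption
for cnt in contours:
    area = cv2.contourArea(cnt)
    x, y, w, h = cv2.boundingRect(cnt)
    extent = float(area) / (w * h)
    if extent > 0.5 and area > 100:
        roi = im[y:y + h, x:x + w]
        text = pytesseract.image_to_string(roi, lang='eng+ara')
        print(text)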
Upvotes: 2
Reputation: 3491
You can play with the parameters to increase or decrease the number of lines detected. I followed this guide.
Loading and inverting the image:
import cv2
import numpy as np

img = cv2.imread('lic.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = 255 - gray   # invert so the dark text becomes bright
Getting the edges:
edges = cv2.Canny(gray,50,150,apertureSize = 5)
minLineLength = 10
maxLineGap = 30
Finding the lines with a Probabilistic Hough Transform:
# pass minLineLength and maxLineGap as keyword arguments so they are not
# mistaken for the optional 'lines' output parameter
lines = cv2.HoughLinesP(edges, 0.7, np.pi / 180, 100,
                        minLineLength=minLineLength, maxLineGap=maxLineGap)
for line in lines:
    for x1, y1, x2, y2 in line:
        if x2 - x1 == 0:   # skip vertical lines to avoid division by zero
            continue
Checking that the slope is between -45 degrees and 45 degrees (you can adjust as needed):
        dy = y2 - y1
        dx = x2 - x1
        if -1 < dy / dx < 1:
            # extend the segment far beyond its endpoints so it spans the image
            cv2.line(img, (x1 + dx * -100, y1 + dy * -100),
                     (x2 + dx * 100, y2 + dy * 100), (0, 255, 0), 2)

cv2.imshow("image: " + str(len(lines)), img)
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.imwrite('houghlines3.jpg', img)
Which produced this image:
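If you want the actual horizontal cuts rather than just the drawn lines, a rough sketch would be to collect the y-coordinates of the near-horizontal lines and slice the image between them (the 10 px merge tolerance is just a guess, and ideally you would slice a clean copy of the image taken before the green lines were drawn):

# sketch: use the y-coordinates of the near-horizontal lines found above
# as cut positions
ys = []
for line in lines:
    for x1, y1, x2, y2 in line:
        if x2 - x1 != 0 and -1 < (y2 - y1) / (x2 - x1) < 1:
            ys.append((y1 + y2) // 2)

cuts = []
for y in sorted(ys):
    if not cuts or y - cuts[-1] > 10:   # merge detections that are close together
        cuts.append(y)

bounds = [0] + cuts + [img.shape[0]]
for i in range(len(bounds) - 1):
    strip = img[bounds[i]:bounds[i + 1]]
    if strip.size:
        cv2.imwrite('strip_{}.png'.format(i), strip)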
Upvotes: 2