Reputation: 31
So, I have been experimenting with Pytesseract for a short while. I want it to read a small part of the screen, and return what it says. The image I want it to read is:
This is my code currently (following a YouTube video):
import cv2
import pytesseract as tes
tes.pytesseract.tesseract_cmd = "C:\\Program Files\\Tesseract-OCR\\tesseract.exe"
img = cv2.imread("DoneCheck.png")
img = cv2.cvtColor(img,cv2.COLOR_BGR2RGB)
G = []
print(tes.image_to_string(img))
G.append(tes.image_to_string(img))
print(G)
hImg, wImg,_ = img.shape
boxes = tes.image_to_boxes(img)
for b in boxes.splitlines():
    print(b)
    b = b.split(' ')
    #print(b)
    x,y,w,h = int(b[1]), int(b[2]), int(b[3]), int(b[4])
    cv2.rectangle(img, (x, hImg-y), (w, hImg - h), (0, 0, 255), 1)
cv2.imshow("result", img)
cv2.waitKey()
And this is what it prints:
Done Canceling
♀
['Done Canceling\n\x0c']
D 12 8 23 19 0
o 20 8 29 19 0
n 25 8 34 17 0
e 36 8 55 17 0
C 62 8 72 19 0
a 74 8 81 17 0
n 78 5 88 20 0
c 84 8 93 17 0
e 95 8 102 17 0
l 105 8 113 17 0
i 115 8 122 20 0
n 125 8 134 17 0
g 136 5 145 17 0
Now, the box output at the end is completely fine. But the text output ends with a "♀" sign, and when I put the string into a list it shows up as "\n\x0c". That is the part I don't understand. I'm mostly doing this project for fun and learning, so if anyone could explain what it is, that would be great :). And if I forgot some crucial information, please let me know (new user).
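For reference, each box line in the output above follows Tesseract's box format: the character, then left, bottom, right, top and a page number, measured from the bottom-left corner of the image. OpenCV measures from the top-left, which is why the loop subtracts the y values from the image height. A minimal sketch of that conversion, where `box_to_cv_rect` is a hypothetical helper name and the image height of 30 is an assumed value for illustration only:

```python
def box_to_cv_rect(line, img_height):
    """Convert one image_to_boxes line into OpenCV corner points.

    Hypothetical helper: the input is 'char left bottom right top page'
    with a bottom-left origin; the result uses a top-left origin.
    """
    ch, left, bottom, right, top, _page = line.split()
    left, bottom, right, top = map(int, (left, bottom, right, top))
    top_left = (left, img_height - top)
    bottom_right = (right, img_height - bottom)
    return ch, top_left, bottom_right


# First box line from the question; the height 30 is an assumption.
print(box_to_cv_rect("D 12 8 23 19 0", 30))
```

The two returned points can be passed straight to cv2.rectangle as its corner arguments.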
Upvotes: 2
Views: 2903
Reputation: 16
For your config, all you need is
xconfig = "-c page_separator=''"
The page separator (a form feed, "\x0c") is appended by Tesseract itself by default, and pytesseract passes it through unchanged.
import cv2
import pytesseract as tes
tes.pytesseract.tesseract_cmd = "C:\\Program Files\\Tesseract-OCR\\tesseract.exe"
xconfig = "-c page_separator=''"
img = cv2.imread("DoneCheck.png")
img = cv2.cvtColor(img,cv2.COLOR_BGR2RGB)
G = []
print(tes.image_to_string(img, config=xconfig))
G.append(tes.image_to_string(img, config=xconfig))
print(G)
hImg, wImg,_ = img.shape
boxes = tes.image_to_boxes(img)
for b in boxes.splitlines():
    print(b)
    b = b.split(' ')
    #print(b)
    x,y,w,h = int(b[1]), int(b[2]), int(b[3]), int(b[4])
    cv2.rectangle(img, (x, hImg-y), (w, hImg - h), (0, 0, 255), 1)
cv2.imshow("result", img)
cv2.waitKey()
I found the answer at https://askubuntu.com/questions/1276440/why-does-tesseract-append-l-to-the-output
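To spell out what the extra character is: "\x0c" is the ASCII form feed (code point 12, also written "\f"), which legacy Windows console code pages tend to draw as "♀". If changing the Tesseract config is not convenient, the trailing characters can also be stripped in Python. A small sketch, using the string from the question rather than running OCR:

```python
# Sample string copied from the question's printed list; no OCR is run here.
raw = "Done Canceling\n\x0c"

# "\x0c" and "\f" are two spellings of the same form feed character.
print("\x0c" == "\f", ord("\x0c"))

# str.strip() removes leading/trailing whitespace, and both "\n" and
# "\x0c" count as whitespace in Python.
clean = raw.strip()
print(repr(clean))
```

Stripping in Python leaves the Tesseract defaults alone, while the page_separator setting removes the character at the source; either approach gives the same text here.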
Upvotes: 0
Reputation: 1
My solution:
import pytesseract
from pytesseract import Output
...
xconfig = '--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789'
x1 = pytesseract.image_to_data(image, output_type=Output.DICT, config=xconfig)
print("value: ", int(x1["text"][4]))
output --> value: 6
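With Output.DICT, image_to_data returns a dictionary of parallel lists, one entry per layout element (page, block, paragraph, line, word), which is why the recognized character sits at index 4 for a single-character image. Hard-coding that index is fragile, so here is a sketch that picks out the non-empty text entries instead. The `recognized_words` helper and the `sample` dictionary are hypothetical; the sample only mimics the shape of the real result:

```python
def recognized_words(data):
    """Return the non-empty entries of data["text"], in order.

    Hypothetical helper: `data` is expected to look like the dictionary
    returned by image_to_data(..., output_type=Output.DICT).
    """
    return [text for text in data["text"] if text.strip()]


# Made-up dictionary shaped like the result for a one-digit image.
sample = {
    "level": [1, 2, 3, 4, 5],
    "text": ["", "", "", "", "6"],
}

words = recognized_words(sample)
value = int(words[0]) if words else None
print("value:", value)
```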
Upvotes: 0
Reputation: 7985
If you upsample:
img = cv2.resize(img, (0, 0), fx=2, fy=2)
and then apply Otsu's threshold to the grayscale image (gry, defined in the full code below):
thr = cv2.threshold(gry, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
Now if you read:
print(pytesseract.image_to_string(thr))
Result:
Done Canceling
You can get the same result with pytesseract 0.3.7.
Make sure to read the Tesseract documentation page "Improving the quality of the output".
Code:
import cv2
import pytesseract
# Load the image
img = cv2.imread("hocpQ.png")
# Up-sample
img = cv2.resize(img, (0, 0), fx=2, fy=2)
# Convert to the gray-scale
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Otsu threshold (the threshold value is chosen automatically)
thr = cv2.threshold(gry, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
# OCR
print(pytesseract.image_to_string(thr))
# Display
cv2.imshow("result", thr)
cv2.waitKey(0)
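The cv2.THRESH_OTSU flag in the code above makes OpenCV choose the threshold automatically, so the 0 passed as the threshold value is ignored. To illustrate the idea, here is a pure-Python sketch of Otsu's method on a flat list of 8-bit pixel values. It is not OpenCV's implementation; `otsu_threshold`, `binarize` and the pixel data are made up for the example:

```python
def otsu_threshold(pixels):
    """Pick the threshold that best separates dark and bright pixels.

    Illustrative sketch: tries every threshold t and keeps the one with
    the largest between-class variance (up to a constant factor).
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of those pixel values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, nothing to separate
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        score = w0 * w1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def binarize(pixels, t):
    """Map pixels above t to 255 and the rest to 0 (THRESH_BINARY style)."""
    return [255 if p > t else 0 for p in pixels]


# Made-up image: half dark pixels (10), half bright pixels (200).
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
print(binarize([10, 200], t))
```

Because the threshold adapts to the image's own histogram, this kind of binarization tends to work on small screen captures without hand-tuning a cutoff.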
Upvotes: 1