Reputation: 77
I'm creating a bot for a video game. Everything is working well (thanks to some Stack Overflow members), but pytesseract's response time is too high.
I have to read a picture of this kind every second (after converting it to black text on a white background, which is a very quick step).
What I'm doing is dividing the picture into 9 parts, one for each line, and then calling pytesseract.image_to_string(img) on each part.
This process takes about 3 seconds, and I think it can be faster, given that the text is short.
I noticed high disk I/O in Process Hacker; see the following screenshot: Disk I/O
One last thing: I have the feeling that it's a bit better when running the Python script as administrator, but I'm not sure, and it's not enough.
Do you have a solution that I can implement to make it faster?
Upvotes: 1
Views: 2482
Reputation: 3328
You need to use the Tesseract API directly instead of pytesseract, which initializes Tesseract (e.g. reads the traineddata) each time you run OCR, and also stores the input image on disk and reads the OCR result back from disk. For an example, have a look at https://github.com/zdenop/SimpleTesseractPythonWrapper/blob/master/SimpleTesseractPythonWrapper.ipynb
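To illustrate the idea of initializing once and reusing the engine, here is a minimal sketch. It does not use the ctypes wrapper from the linked notebook; it assumes the third-party tesserocr package (and its PyTessBaseAPI class) is installed instead, which is just one possible binding. The class name ReusableOcr and the method read_lines are hypothetical names made up for this example, and the single-line page segmentation mode is an assumption based on your one-image-per-line split.

```python
class ReusableOcr:
    """Keeps a single Tesseract engine alive across many OCR calls."""

    def __init__(self, api=None):
        if api is None:
            # Assumption: the third-party `tesserocr` package is installed.
            # Imported lazily so the class can be exercised with a stand-in API.
            from tesserocr import PSM, PyTessBaseAPI

            # The traineddata is loaded here, once, not on every read.
            # PSM.SINGLE_LINE assumes each image holds exactly one text line.
            api = PyTessBaseAPI(psm=PSM.SINGLE_LINE)
        self._api = api

    def read_lines(self, line_images):
        """Run OCR on each image (e.g. PIL images) and return the stripped texts."""
        texts = []
        for image in line_images:
            # The image is handed over in memory; no temporary file is written.
            self._api.SetImage(image)
            texts.append(self._api.GetUTF8Text().strip())
        return texts

    def close(self):
        """Release the engine when the bot shuts down."""
        self._api.End()
```

You would create one ReusableOcr when the bot starts, call read_lines with your 9 cropped images every second, and call close on exit. Whether this brings you under one second depends on your images and machine, so measure it before and after.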
Upvotes: 2