Reputation: 53
I want to build a Flask app that lets users upload an image with text on it. pytesseract should then extract the text and return it.
I did some research and found this article: https://stackabuse.com/pytesseract-simple-python-optical-character-recognition/#disqus_thread
It pretty much explains what I want to do. The only thing I don't understand is where I am supposed to save the OCR script in which the ocr_core function is defined, because later in the article the author is able to import that function.
Upvotes: 0
Views: 1957
Reputation: 7656
That file should be called ocr_core.py and be saved in the same directory as app.py.
Consider ocr_core.py:
try:
    from PIL import Image
except ImportError:
    import Image
import pytesseract

def ocr_core(filename):
    """
    This function will handle the core OCR processing of images.
    """
    # We'll use Pillow's Image class to open the image and pytesseract to detect the string in the image
    text = pytesseract.image_to_string(Image.open(filename))
    return text
Within app.py, the line from ocr_core import ocr_core imports the function ocr_core from the module ocr_core.
To put it another way: from some_module import some_func would import the some_func function from the some_module module, which is in the file some_module.py.
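If it helps to see why "same directory" matters, here is a minimal sketch using only the standard library. It assumes nothing about the tutorial's code: some_module and some_func are the placeholder names from above, and the temporary folder just stands in for your project directory. When you run python app.py, Python puts app.py's own directory at the front of sys.path for you; the sketch does that step by hand so it can run anywhere.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# A throwaway folder that stands in for the project directory (hypothetical).
project_dir = Path(tempfile.mkdtemp())

# Write a placeholder some_module.py that defines some_func.
(project_dir / "some_module.py").write_text(
    "def some_func():\n"
    "    return 'hello from some_module'\n"
)

# Running `python app.py` adds app.py's directory to the front of sys.path
# automatically. Here we add the folder manually to imitate that.
sys.path.insert(0, str(project_dir))
importlib.invalidate_caches()

# The module name is simply the file name without the .py extension.
from some_module import some_func

print(some_func())
```

So as long as ocr_core.py sits next to app.py and you start the app from that file, the import should resolve without any extra configuration.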
Also, towards the end of ocr_core.py the tutorial has:
print(ocr_core('images/ocr_example_1.png'))
This line would technically be executed at the point where that import runs in app.py. It looks like a line for testing the functionality. Typically this should be in a block like:
if __name__ == '__main__':
    print(ocr_core('images/ocr_example_1.png'))
This means the line will only be executed when you run the file directly with the Python interpreter (python ocr_core.py), NOT when that module is imported somewhere. As written in the tutorial, that print line would run when you start the Flask server, which may be undesirable.
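To check this behaviour for yourself without involving Flask or pytesseract, here is a rough, self-contained sketch. The file name guard_demo.py and its two print lines are made up for the demonstration; the script writes that file to a temporary folder, imports it once, and then runs it directly in a subprocess so the two cases can be compared.

```python
import contextlib
import importlib
import io
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical module with one unguarded and one guarded print line.
project_dir = Path(tempfile.mkdtemp())
module_path = project_dir / "guard_demo.py"
module_path.write_text(
    "print('unguarded line ran')\n"
    "if __name__ == '__main__':\n"
    "    print('guarded line ran')\n"
)

# Make the temporary folder importable, as the script's own directory would be.
sys.path.insert(0, str(project_dir))
importlib.invalidate_caches()

# Case 1: import the module and capture whatever it prints at import time.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    import guard_demo
imported_output = buffer.getvalue().splitlines()

# Case 2: run the same file directly, like `python guard_demo.py`.
script_output = subprocess.run(
    [sys.executable, str(module_path)],
    capture_output=True,
    text=True,
    check=True,
).stdout.splitlines()

print("on import:", imported_output)
print("as script:", script_output)
```

On import only the unguarded line is printed, while running the file directly prints both, which is why test code like the tutorial's print call is better placed under the guard.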
Upvotes: 1