How do you use pytesseract in a Flask app?

Question

I want to build a Flask app that lets users upload an image containing text. pytesseract should then extract the text and return it.

I did some research and found this article: https://stackabuse.com/pytesseract-simple-python-optical-character-recognition/#disqus_thread

It pretty much explains what I want to do. The only thing I don't understand is where I am supposed to save the OCR script in which the ocr_core function is defined, because later in the article the author is able to import that function.

Accepted Answer

That file should be called ocr_core.py and be saved in the same directory as app.py.

Consider ocr_core.py:

try:
    from PIL import Image
except ImportError:
    import Image
import pytesseract

def ocr_core(filename):
    """
    This function will handle the core OCR processing of images.
    """
    # Use Pillow's Image class to open the image and pytesseract to
    # detect the string in the image.
    text = pytesseract.image_to_string(Image.open(filename))
    return text

Within app.py, the line from ocr_core import ocr_core imports the function ocr_core from the module ocr_core.

To put it another way: from some_module import some_func imports the some_func function from the some_module module, which lives in the file some_module.py.
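To see why the file location matters, here is a minimal sketch of the import lookup. It assumes a throwaway temporary folder standing in for your project directory; some_module and some_func are the placeholder names from above, not anything from the tutorial.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for the folder that holds app.py.
project_dir = Path(tempfile.mkdtemp())

# Save some_module.py in that folder, next to where app.py would live.
(project_dir / "some_module.py").write_text(
    "def some_func():\n"
    "    return 'hello from some_module'\n"
)

# Running `python app.py` normally puts app.py's own directory at the
# front of sys.path. This sketch imitates that by hand.
sys.path.insert(0, str(project_dir))
importlib.invalidate_caches()

# The import now succeeds because some_module.py is in a directory
# that appears on sys.path.
from some_module import some_func

result = some_func()
print(result)
```

If some_module.py were saved in a folder that is not on sys.path, the same import line would raise ModuleNotFoundError, which is why keeping ocr_core.py beside app.py is the simplest option.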

Also, towards the end of ocr_core.py, the tutorial has this line:

print(ocr_core('images/ocr_example_1.png'))

Because it sits at module level, this line is executed whenever the module is imported, including at the point where the import runs in app.py. It looks like a line for testing the function. Typically, such a line belongs in a block like:

if __name__ == '__main__':
    print(ocr_core('images/ocr_example_1.png'))

This means the line is only executed when you run the file directly with the Python interpreter (python ocr_core.py), NOT when the module is imported somewhere else. As written in the tutorial, that print line would run when you start the Flask server, which may be undesirable.
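The difference can be checked with a small experiment. This is only a sketch: the module names unguarded_demo and guarded_demo are made up for illustration, and the ocr_core inside them is a dummy that returns a string instead of calling pytesseract, so the example runs without Tesseract installed.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

module_dir = Path(tempfile.mkdtemp())

# Hypothetical module with the print call at module level,
# as in the tutorial.
(module_dir / "unguarded_demo.py").write_text(
    "def ocr_core(filename):\n"
    "    return 'text from ' + filename\n"
    "\n"
    "print(ocr_core('images/ocr_example_1.png'))\n"
)

# The same hypothetical module with the print call behind the guard.
(module_dir / "guarded_demo.py").write_text(
    "def ocr_core(filename):\n"
    "    return 'text from ' + filename\n"
    "\n"
    "if __name__ == '__main__':\n"
    "    print(ocr_core('images/ocr_example_1.png'))\n"
)

sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()


def output_printed_on_import(module_name):
    """Import a module and return whatever it printed while importing."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        importlib.import_module(module_name)
    return buffer.getvalue()


unguarded_output = output_printed_on_import("unguarded_demo")
guarded_output = output_printed_on_import("guarded_demo")

print(repr(unguarded_output))  # the test line ran during the import
print(repr(guarded_output))    # nothing ran; __name__ was 'guarded_demo'
```

During an import, __name__ is set to the module's name rather than '__main__', so the guarded block is skipped and only the function definition takes effect.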
