Dds 03

Reputation: 53

How do you use pytesseract in a Flask app?

I want to build a Flask app that lets users upload an image containing text. pytesseract should then extract the text and the app should return it.

I did some research and found this article: https://stackabuse.com/pytesseract-simple-python-optical-character-recognition/#disqus_thread

It pretty much explains what I want to do. The only thing I don't understand is where I am supposed to save the OCR script that defines the ocr_core function, since the article is later able to import that function.

Upvotes: 0

Views: 1957

Answers (1)

v25

Reputation: 7656

That file should be called ocr_core.py and be saved in the same directory as app.py.

Consider ocr_core.py:

try:
    from PIL import Image
except ImportError:
    import Image
import pytesseract

def ocr_core(filename):
    """
    This function will handle the core OCR processing of images.
    """
    # Use Pillow's Image class to open the image and pytesseract to detect the string in it.
    text = pytesseract.image_to_string(Image.open(filename))
    return text

Within app.py, the line "from ocr_core import ocr_core" imports the function ocr_core from the module ocr_core.

To put it another way: "from some_module import some_func" imports the some_func function from the some_module module, which lives in the file some_module.py.
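As a rough illustration of how app.py might use the imported function, here is a minimal sketch. It is not the tutorial's code: the create_app factory, the /upload route, and the "file" form field name are all assumptions made for this example. The OCR function is passed in as a parameter so the sketch runs without Tesseract installed; in a real app.py you would import ocr_core as described above and pass that in.

```python
from flask import Flask, request


def create_app(ocr_func):
    """Build a small Flask app around an OCR callable (hypothetical factory)."""
    app = Flask(__name__)

    @app.route("/upload", methods=["POST"])  # route name is an assumption
    def upload():
        # "file" is an assumed form field name, not taken from the tutorial.
        uploaded = request.files.get("file")
        if uploaded is None or uploaded.filename == "":
            return {"error": "no file uploaded"}, 400
        # Pillow's Image.open accepts a file-like object as well as a path,
        # so the upload stream can be handed to ocr_core without saving it.
        text = ocr_func(uploaded.stream)
        return {"text": text}

    return app
```

In app.py this could be wired up with something like app = create_app(ocr_core) after the import, assuming ocr_core.py sits next to app.py.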

Also, towards the end of ocr_core.py the tutorial has:

print(ocr_core('images/ocr_example_1.png'))

This line is executed at the point where the import runs in app.py, because importing a module executes its top-level code. It looks like a line for testing the functionality. Typically this should sit in a block like:

if __name__ == '__main__':
    print(ocr_core('images/ocr_example_1.png'))

This means it is only executed when you run the file directly with the Python interpreter (python ocr_core.py), NOT when the module is imported somewhere. As written in the tutorial, that print line would run when you start the Flask server, which may be undesirable.
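If you want to see the difference for yourself, the sketch below writes a throwaway module to a temporary directory and then both runs it and imports it. The module name demo_ocr_core and its fake ocr_core (which does no OCR at all) are stand-ins invented for this demonstration.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical stand-in for ocr_core.py: same layout, but no real OCR.
MODULE_SOURCE = '''
def ocr_core(filename):
    return "text from " + filename

if __name__ == "__main__":
    print(ocr_core("images/ocr_example_1.png"))
'''


def run_demo():
    """Return (output when run as a script, output when only imported)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_ocr_core.py")
        with open(path, "w") as f:
            f.write(MODULE_SOURCE)

        # Run directly: __name__ is "__main__", so the guarded print executes.
        script_out = subprocess.run(
            [sys.executable, path],
            capture_output=True, text=True, check=True,
        ).stdout

        # Import only: the function is defined, but the guarded block is skipped.
        import_out = subprocess.run(
            [sys.executable, "-c", "from demo_ocr_core import ocr_core"],
            cwd=tmp, capture_output=True, text=True, check=True,
        ).stdout
    return script_out, import_out
```

Calling run_demo() should give the printed text for the script run and an empty string for the import, which is the behaviour you want when Flask imports the module.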

Upvotes: 1
