Bob Stone

Reputation: 100

How to solve error pytesseract.pytesseract.TesseractNotFoundError: tesseract is not installed or it's not in your path

I'm getting the error pytesseract.pytesseract.TesseractNotFoundError: tesseract is not installed or it's not in your path. The program worked perfectly when I tested it a few minutes earlier, but now every run fails with this error. I don't know what to do. Here is my code:

from PIL import ImageGrab
import cv2
import pytesseract
import numpy as np
from tkinter import Tk
from tkinter.filedialog import askopenfilename
ask = input("Do you want to ocr in realtime or choose a picture (r/p)?")
if ask == 'r':
    while True:
        screen = np.array(ImageGrab.grab(bbox=(700, 300, 1600, 1000)))
        # print('Frame took {} seconds'.format(time.time()-last_time))
        cv2.imshow('window', screen)
        if cv2.waitKey(25) & 0xFF == ord('q'):
            cv2.destroyAllWindows()
            break
        print(pytesseract.image_to_string(screen, lang='eng', config='--psm 6'))
if ask == 'p':
    Tk().withdraw()  # we don't want a full GUI, so keep the root window from appearing
    filename = askopenfilename()  # show an "Open" dialog box and return the path to the selected file
    print(pytesseract.image_to_string(filename, lang='eng', config='--psm 6'))

Upvotes: 1

Views: 6884

Answers (4)

Trees

Reputation: 1293

The two things that matter most are the installation itself and the trained data file. For example, Arabic requires the ara.traineddata file. I suggest using the proper language model and the latest version:

For Windows 10:

Install tesseract-ocr-w64-setup-v5.0.0-alpha.20200328.exe (64 bit).

To validate the installation, run this in PowerShell or cmd:

tesseract -v

It will output something like this: tesseract v5.0.0-alpha.20200328

For Mac OS:

brew install tesseract

To validate the installation, run this in a terminal:

tesseract -v

It will output something like this: tesseract 4.1.1 and also the installed image libraries leptonica-1.80.0 libgif 5.2.1 : libjpeg 9d : libpng 1.6.37 : libtiff 4.1.0 : zlib 1.2.11 : libwebp 1.1.0 : libopenjp2 2.3.1 Found AVX2 Found AVX Found FMA Found SSE

If you are not sure about the path, simply copy the ara.traineddata file into the same folder as your Python .py file:

import pytesseract
from PIL import Image
import os
os.environ["TESSDATA_PREFIX"] = "" # Leaving it empty because file is already copy pasted in the current directory
print(os.getenv("TESSDATA_PREFIX"))
# Copy paste the ara.traineddata file in the same directory as this python code
print(pytesseract.image_to_string(Image.open('cropped.png'), lang="ara"))

For Linux/Ubuntu OS:

sudo apt-get install tesseract-ocr

The validation command and the Python code are the same as for Mac OS.

Also make sure the path to the tesseract executable is correct.

This code works fine if the ara.traineddata file is downloaded successfully:

import pytesseract
from PIL import Image
print(pytesseract.image_to_string(Image.open('cropped.png'), lang="ara"))
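Before calling image_to_string with lang="ara", it can help to confirm that the trained data file really is where you think it is. The following is only a minimal sketch: traineddata_path is a hypothetical helper, not part of pytesseract, and the fallback to the current directory is an assumption that matches the copy-paste approach above. Where Tesseract itself searches depends on its version and build.

```python
import os
from pathlib import Path


def traineddata_path(lang, tessdata_dir=None):
    """Return the path to <lang>.traineddata if it exists, else None.

    Hypothetical helper for illustration. tessdata_dir defaults to the
    TESSDATA_PREFIX environment variable and falls back to the current
    directory (an assumption made for this sketch).
    """
    base = Path(tessdata_dir or os.environ.get("TESSDATA_PREFIX") or ".")
    candidate = base / f"{lang}.traineddata"
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    found = traineddata_path("ara")
    print(found if found else "ara.traineddata not found; download it first")
```

If this prints the "not found" message, the file is missing from the folder being checked, so the download or copy step should be repeated before debugging anything else.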

You can follow this tutorial for details. Its demo output uses all available languages.

Upvotes: 0

anil kumar

Reputation: 860

There are several possible causes for this error.

Check whether tesseract.exe is installed. If it is not, download the installer from the link below and install it. Note the installation path for later.

https://github.com/UB-Mannheim/tesseract/wiki

If Tesseract is already installed but pytesseract still cannot find it, you can set the path within the script like this:

pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
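If you would rather not hard-code a single path, one option is to look the executable up first and only then assign tesseract_cmd. This is a rough sketch, not the only way to do it: find_tesseract and configure_pytesseract are hypothetical helper names, and the candidate paths are assumed default install locations that you should adjust for your machine.

```python
import os
import shutil

# Assumed install locations; adjust these to match your machine.
CANDIDATE_PATHS = [
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
]


def find_tesseract(candidates=CANDIDATE_PATHS):
    """Return a path to the tesseract executable, or None if none is found."""
    on_path = shutil.which("tesseract")  # searches the PATH variable
    if on_path:
        return on_path
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


def configure_pytesseract():
    """Point pytesseract at the located binary; raise if there is none."""
    import pytesseract  # imported here so find_tesseract works without it

    path = find_tesseract()
    if path is None:
        raise FileNotFoundError("tesseract executable not found")
    pytesseract.pytesseract.tesseract_cmd = path
    return path
```

Calling configure_pytesseract() once at the top of the script, before any image_to_string call, should be enough. If it raises, Tesseract is most likely not installed at all.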

Upvotes: 1

Mahrez BenHamad

Reputation: 2096

I was stuck on the same problem in the past. I think you have to make sure that you:

  1. Install it from here
  2. Run pip install pytesseract
  3. Add a new variable called 'tesseract' to the environment variables, with a value of

    C:\Program Files (x86)\Tesseract-OCR\tesseract.exe

  4. Run tesseract in the command line; it should respond with usage information
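The check in step 4 can also be done from Python, which tells you whether the interpreter running your script can start the program. This is just a sketch: tesseract_version is a hypothetical helper, and it assumes the executable accepts the --version flag.

```python
import subprocess


def tesseract_version(cmd="tesseract"):
    """Return the first line printed by `<cmd> --version`, or None on failure."""
    try:
        result = subprocess.run(
            [cmd, "--version"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # OSError covers "command not found" and permission problems.
        return None
    # Some builds print the version to stderr instead of stdout.
    output = (result.stdout or result.stderr).strip()
    lines = output.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    version = tesseract_version()
    print(version if version else "tesseract could not be started from Python")
```

If the command works in a terminal but this returns None, the terminal and the Python process probably see different environment variables. Restarting the IDE or terminal after changing them usually sorts that out.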

That's it :)

Upvotes: 1

alex_bits

Reputation: 752

You need to tell pytesseract where the tesseract binary is located:

import pytesseract
pytesseract.pytesseract.tesseract_cmd = r'<full_path_to_your_tesseract_executable>'

Doing this should solve your problem

Upvotes: 0
