Aqiff M

Reputation: 23

Error installing TesserOcr and PyTesseract using pip

I tried to install tesserocr and PyTesseract using pip for my internship task. My Python version is 3.7, and I have MSVC 2019 and 2017 installed (for CUDA and cuDNN).

pip install tesserocr

It gives me this error:

Using cached https://files.pythonhosted.org/packages/e3/77/fb26b321c3b9ce4a47af12b19e85ddbf4d0629adb6552d85276e824e6e51/tesserocr-2.5.0.tar.gz
Building wheels for collected packages: tesserocr
  Building wheel for tesserocr (setup.py) ... error
  ERROR: Command errored out with exit status 1:
   command: 'c:\users\user\anaconda3\envs\tensorflow2\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\user\\AppData\\Local\\Temp\\pip-install-r0bpakzy\\tesserocr\\setup.py'"'"'; __file__='"'"'C:\\Users\\user\\AppData\\Local\\Temp\\pip-install-r0bpakzy\\tesserocr\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d 'C:\Users\user\AppData\Local\Temp\pip-wheel-_68w2jz0' --python-tag cp35
       cwd: C:\Users\user\AppData\Local\Temp\pip-install-r0bpakzy\tesserocr\
  Complete output (21 lines):
  Failed to extract tesseract version number from: tesseract v5.0.0-alpha.20191030
   leptonica-1.78.0
    libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.3) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
   Found AVX2
   Found AVX
   Found FMA
   Found SSE
   Found libarchive 3.3.2 zlib/1.2.11 liblzma/5.2.3 bz2lib/1.0.6 liblz4/1.7.5
  Supporting tesseract v3.04.00
  Building with configs: {'cython_compile_time_env': {'TESSERACT_VERSION': 50593792}, 'libraries': ['tesseract', 'lept']}
  running bdist_wheel
  running build
  running build_ext
  building 'tesserocr' extension
  creating build
  creating build\temp.win-amd64-3.5
  creating build\temp.win-amd64-3.5\Release
  C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\bin\HostX64\x64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MT -Ic:\users\user\anaconda3\envs\tensorflow2\include -Ic:\users\user\anaconda3\envs\tensorflow2\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\Include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\ATLMFC\Include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\ucrt" /EHsc /Tptesserocr.cpp /Fobuild\temp.win-amd64-3.5\Release\tesserocr.obj
  tesserocr.cpp
  tesserocr.cpp(634): fatal error C1083: Cannot open include file: 'leptonica/allheaders.h': No such file or directory
  error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Community\\VC\\Tools\\MSVC\\14.16.27023\\bin\\HostX64\\x64\\cl.exe' failed with exit status 2
  ----------------------------------------
  ERROR: Failed building wheel for tesserocr
  Running setup.py clean for tesserocr
Failed to build tesserocr
Installing collected packages: tesserocr
    Running setup.py install for tesserocr ... error
    ERROR: Command errored out with exit status 1:
     command: 'c:\users\user\anaconda3\envs\tensorflow2\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\user\\AppData\\Local\\Temp\\pip-install-r0bpakzy\\tesserocr\\setup.py'"'"'; __file__='"'"'C:\\Users\\user\\AppData\\Local\\Temp\\pip-install-r0bpakzy\\tesserocr\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\user\AppData\Local\Temp\pip-record-wj94aq7v\install-record.txt' --single-version-externally-managed --compile
         cwd: C:\Users\user\AppData\Local\Temp\pip-install-r0bpakzy\tesserocr\
    Complete output (21 lines):
    Failed to extract tesseract version number from: tesseract v5.0.0-alpha.20191030
     leptonica-1.78.0
      libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.3) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
     Found AVX2
     Found AVX
     Found FMA
     Found SSE
     Found libarchive 3.3.2 zlib/1.2.11 liblzma/5.2.3 bz2lib/1.0.6 liblz4/1.7.5
    Supporting tesseract v3.04.00
    Building with configs: {'libraries': ['tesseract', 'lept'], 'cython_compile_time_env': {'TESSERACT_VERSION': 50593792}}
    running install
    running build
    running build_ext
    building 'tesserocr' extension
    creating build
    creating build\temp.win-amd64-3.5
    creating build\temp.win-amd64-3.5\Release
    C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\bin\HostX64\x64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MT -Ic:\users\user\anaconda3\envs\tensorflow2\include -Ic:\users\user\anaconda3\envs\tensorflow2\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\Include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\ATLMFC\Include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\ucrt" /EHsc /Tptesserocr.cpp /Fobuild\temp.win-amd64-3.5\Release\tesserocr.obj
    tesserocr.cpp
    tesserocr.cpp(634): fatal error C1083: Cannot open include file: 'leptonica/allheaders.h': No such file or directory
    error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Community\\VC\\Tools\\MSVC\\14.16.27023\\bin\\HostX64\\x64\\cl.exe' failed with exit status 2
    ----------------------------------------
ERROR: Command errored out with exit status 1: 'c:\users\user\anaconda3\envs\tensorflow2\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\user\\AppData\\Local\\Temp\\pip-install-r0bpakzy\\tesserocr\\setup.py'"'"'; __file__='"'"'C:\\Users\\user\\AppData\\Local\\Temp\\pip-install-r0bpakzy\\tesserocr\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\user\AppData\Local\Temp\pip-record-wj94aq7v\install-record.txt' --single-version-externally-managed --compile Check the logs for full command output.

For PyTesseract, the command I ran is:

pip install tesserocr-2.4.0-cp36-cp36m-win_amd64.whl

It fails with this error:

WARNING: Requirement 'tesserocr-2.4.0-cp36-cp36m-win_amd64.whl' looks like a filename, but the file does not exist
ERROR: tesserocr-2.4.0-cp36-cp36m-win_amd64.whl is not a supported wheel on this platform.

Does anyone have any idea how I can install an OCR library that works well with Python?

Upvotes: 1

Views: 2226

Answers (1)

JSH

Reputation: 139

Your Python version is 3.7, but you are trying to install tesserocr-2.4.0-cp36-cp36m-win_amd64.whl, which is built for Python 3.6 (that is what the cp36 tag means).

Install the tesserocr wheel that matches your installed Python version (3.7) instead: tesserocr-2.4.0-cp37-cp37m-win_amd64.whl

Check the releases page for the builds available for your OS and Python version: https://github.com/simonflueckiger/tesserocr-windows_build/releases
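Before downloading a wheel, it may help to confirm which tag the active interpreter actually expects. The build log in the question shows `--python-tag cp35`, which suggests the `tensorflow2` environment might not be running Python 3.7 at all. The sketch below is only an illustration: the helper names `cpython_wheel_tag` and `wheel_matches_interpreter` are made up for this example, and the check only compares the Python tag in the filename. It does not replace pip's own compatibility check, which also looks at the ABI and platform tags.

```python
import struct
import sys


def cpython_wheel_tag(version_info=sys.version_info):
    """Return the CPython tag for a version, e.g. 'cp37' for Python 3.7."""
    return "cp{}{}".format(version_info[0], version_info[1])


def wheel_matches_interpreter(wheel_filename, python_tag):
    """Return True if the wheel filename's Python tag includes python_tag.

    Wheel filenames end with: <python tag>-<abi tag>-<platform tag>.whl
    """
    if not wheel_filename.endswith(".whl"):
        return False
    parts = wheel_filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        return False
    # The Python tag may be compressed, e.g. "py2.py3".
    return python_tag in parts[-3].split(".")


if __name__ == "__main__":
    tag = cpython_wheel_tag()
    bits = struct.calcsize("P") * 8  # pointer size: 32 or 64
    print("Active interpreter tag: {} ({}-bit)".format(tag, bits))
    wheel = "tesserocr-2.4.0-cp36-cp36m-win_amd64.whl"
    print("Wheel matches:", wheel_matches_interpreter(wheel, tag))
```

Run this with the same python.exe that pip uses (for example, inside the activated conda environment) and pick the wheel whose tag matches the printed one. The warning in the question also says the file does not exist, so the wheel has to be downloaded first and pip given its real path.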

Upvotes: 1
