Reputation: 23
I'm trying to install tesserocr and PyTesseract with pip for an internship task. My Python version is 3.7, and I have MSVC 2019 and 2017 installed (for CUDA and cuDNN). When I run
pip install tesserocr
it gives me this error:
Using cached https://files.pythonhosted.org/packages/e3/77/fb26b321c3b9ce4a47af12b19e85ddbf4d0629adb6552d85276e824e6e51/tesserocr-2.5.0.tar.gz
Building wheels for collected packages: tesserocr
Building wheel for tesserocr (setup.py) ... error
ERROR: Command errored out with exit status 1:
command: 'c:\users\user\anaconda3\envs\tensorflow2\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\user\\AppData\\Local\\Temp\\pip-install-r0bpakzy\\tesserocr\\setup.py'"'"'; __file__='"'"'C:\\Users\\user\\AppData\\Local\\Temp\\pip-install-r0bpakzy\\tesserocr\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d 'C:\Users\user\AppData\Local\Temp\pip-wheel-_68w2jz0' --python-tag cp35
cwd: C:\Users\user\AppData\Local\Temp\pip-install-r0bpakzy\tesserocr\
Complete output (21 lines):
Failed to extract tesseract version number from: tesseract v5.0.0-alpha.20191030
leptonica-1.78.0
libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.3) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
Found AVX2
Found AVX
Found FMA
Found SSE
Found libarchive 3.3.2 zlib/1.2.11 liblzma/5.2.3 bz2lib/1.0.6 liblz4/1.7.5
Supporting tesseract v3.04.00
Building with configs: {'cython_compile_time_env': {'TESSERACT_VERSION': 50593792}, 'libraries': ['tesseract', 'lept']}
running bdist_wheel
running build
running build_ext
building 'tesserocr' extension
creating build
creating build\temp.win-amd64-3.5
creating build\temp.win-amd64-3.5\Release
C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\bin\HostX64\x64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MT -Ic:\users\user\anaconda3\envs\tensorflow2\include -Ic:\users\user\anaconda3\envs\tensorflow2\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\Include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\ATLMFC\Include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\ucrt" /EHsc /Tptesserocr.cpp /Fobuild\temp.win-amd64-3.5\Release\tesserocr.obj
tesserocr.cpp
tesserocr.cpp(634): fatal error C1083: Cannot open include file: 'leptonica/allheaders.h': No such file or directory
error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Community\\VC\\Tools\\MSVC\\14.16.27023\\bin\\HostX64\\x64\\cl.exe' failed with exit status 2
----------------------------------------
ERROR: Failed building wheel for tesserocr
Running setup.py clean for tesserocr
Failed to build tesserocr
Installing collected packages: tesserocr
Running setup.py install for tesserocr ... error
ERROR: Command errored out with exit status 1:
command: 'c:\users\user\anaconda3\envs\tensorflow2\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\user\\AppData\\Local\\Temp\\pip-install-r0bpakzy\\tesserocr\\setup.py'"'"'; __file__='"'"'C:\\Users\\user\\AppData\\Local\\Temp\\pip-install-r0bpakzy\\tesserocr\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\user\AppData\Local\Temp\pip-record-wj94aq7v\install-record.txt' --single-version-externally-managed --compile
cwd: C:\Users\user\AppData\Local\Temp\pip-install-r0bpakzy\tesserocr\
Complete output (21 lines):
Failed to extract tesseract version number from: tesseract v5.0.0-alpha.20191030
leptonica-1.78.0
libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.3) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
Found AVX2
Found AVX
Found FMA
Found SSE
Found libarchive 3.3.2 zlib/1.2.11 liblzma/5.2.3 bz2lib/1.0.6 liblz4/1.7.5
Supporting tesseract v3.04.00
Building with configs: {'libraries': ['tesseract', 'lept'], 'cython_compile_time_env': {'TESSERACT_VERSION': 50593792}}
running install
running build
running build_ext
building 'tesserocr' extension
creating build
creating build\temp.win-amd64-3.5
creating build\temp.win-amd64-3.5\Release
C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\bin\HostX64\x64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MT -Ic:\users\user\anaconda3\envs\tensorflow2\include -Ic:\users\user\anaconda3\envs\tensorflow2\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\Include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\ATLMFC\Include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\ucrt" /EHsc /Tptesserocr.cpp /Fobuild\temp.win-amd64-3.5\Release\tesserocr.obj
tesserocr.cpp
tesserocr.cpp(634): fatal error C1083: Cannot open include file: 'leptonica/allheaders.h': No such file or directory
error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Community\\VC\\Tools\\MSVC\\14.16.27023\\bin\\HostX64\\x64\\cl.exe' failed with exit status 2
----------------------------------------
ERROR: Command errored out with exit status 1: 'c:\users\user\anaconda3\envs\tensorflow2\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\user\\AppData\\Local\\Temp\\pip-install-r0bpakzy\\tesserocr\\setup.py'"'"'; __file__='"'"'C:\\Users\\user\\AppData\\Local\\Temp\\pip-install-r0bpakzy\\tesserocr\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\user\AppData\Local\Temp\pip-record-wj94aq7v\install-record.txt' --single-version-externally-managed --compile Check the logs for full command output.
For PyTesseract, the command I ran is:
pip install tesserocr-2.4.0-cp36-cp36m-win_amd64.whl
It fails with this error:
WARNING: Requirement 'tesserocr-2.4.0-cp36-cp36m-win_amd64.whl' looks like a filename, but the file does not exist
ERROR: tesserocr-2.4.0-cp36-cp36m-win_amd64.whl is not a supported wheel on this platform.
Does anyone have an idea how I can install an OCR library that works well with Python?
Upvotes: 1
Views: 2226
Reputation: 139
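Your Python version is 3.7, but you are trying to install tesserocr-2.4.0-cp36-cp36m-win_amd64.whl, which is built for Python 3.6 (the cp36 tag). That mismatch is why pip reports "not a supported wheel on this platform".
Install the wheel that matches your installed Python version (3.7) instead: 'tesserocr-2.4.0-cp37-cp37m-win_amd64.whl'
pip also warns that the file does not exist, so download the wheel first and run pip from the folder that contains it, or pass the full path to the .whl file.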
Check the releases page for a wheel that matches your OS and Python version: https://github.com/simonflueckiger/tesserocr-windows_build/releases
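If you are not sure which wheel your environment accepts, a small script can help. The sketch below is only an illustration: the helper names `expected_windows_tag` and `wheel_matches` are made up for this answer, and it assumes a CPython interpreter on Windows. It builds the tag that a matching wheel filename should end with. Note that the build log in the question contains `--python-tag cp35`, so it may be worth running this inside the `tensorflow2` environment to confirm that it really is Python 3.7.

```python
import struct
import sys


def expected_windows_tag(major, minor, is_64bit):
    """Return the CPython wheel tag for a Windows interpreter.

    Hypothetical helper for illustration, e.g. (3, 7, True) gives
    'cp37-cp37m-win_amd64'.
    """
    py = "cp{}{}".format(major, minor)
    # CPython builds before 3.8 carry an "m" suffix in the ABI tag.
    abi = py + ("m" if (major, minor) < (3, 8) else "")
    plat = "win_amd64" if is_64bit else "win32"
    return "{}-{}-{}".format(py, abi, plat)


def wheel_matches(filename, tag):
    """Check whether a wheel filename ends with the given tag."""
    return filename.endswith("-" + tag + ".whl")


if __name__ == "__main__":
    # Pointer size tells us whether this interpreter is 32-bit or 64-bit.
    is_64bit = struct.calcsize("P") * 8 == 64
    tag = expected_windows_tag(sys.version_info[0], sys.version_info[1], is_64bit)
    print("Interpreter:", sys.executable)
    print("Expected wheel tag:", tag)
    print("cp36 wheel from the question matches:",
          wheel_matches("tesserocr-2.4.0-cp36-cp36m-win_amd64.whl", tag))
```

Run it with the same python.exe that pip uses (the path is printed on the first line), then pick the release file whose name ends with the printed tag.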
Upvotes: 1