Reputation: 14877
I am having trouble getting GPU support to work in OpenCV 4.4.0, downloaded from https://opencv.org/releases/. The changelog has 23 hits for CUDA, the last one under 4.4.0. Yet when I run a simple C++ sample that calls:
cv::cuda::getCudaEnabledDeviceCount()
I get zero. I then tried calling setDevice(0) as mentioned here, and the result was the following exception:
OpenCV(4.4.0) C:\build\master_winpack-build-win64-vc15\opencv\modules\core\include\opencv2/core/private.cuda.hpp:106: error: (-216:No CUDA support) The library is compiled without CUDA support in function 'throw_no_cuda'
This seems to indicate there is NO CUDA support in the pre-built OpenCV. Must I build it myself? Are there trusted Windows binaries with CUDA support?
I have even installed the latest CUDA Toolkit (available at C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0) and made sure to update the system environment variables, but to no avail.
Yet, even before installing the CUDA Toolkit, I was able to run opencv_version_win32.exe from opencv\build\x64\vc15\bin, which gave me the (relevant) output below.
Is my card, a GeForce GTX 980M, too old? Are there additional setup steps I need to take before invoking cv::cuda::getCudaEnabledDeviceCount()?
OpenCL Platforms:
NVIDIA CUDA
dGPU: GeForce GTX 980M (OpenCL 1.2 CUDA)
Current OpenCL device:
Type = dGPU
Name = GeForce GTX 980M
Version = OpenCL 1.2 CUDA
Driver version = 451.67
Address bits = 64
Compute units = 12
Max work group size = 1024
Local memory size = 48 KB
Max memory allocation size = 2 GB
Double support = Yes
Host unified memory = No
Device extensions:
cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics
cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics
cl_khr_fp64
cl_khr_byte_addressable_store
cl_khr_icd
cl_khr_gl_sharing
cl_nv_compiler_options
cl_nv_device_attribute_query
cl_nv_pragma_unroll
cl_nv_d3d10_sharing
cl_khr_d3d10_sharing
cl_nv_d3d11_sharing
cl_nv_copy_opts
cl_nv_create_buffer
cl_khr_int64_base_atomics
cl_khr_int64_extended_atomics
Has AMD Blas = No
Has AMD Fft = No
Preferred vector width char = 1
Preferred vector width short = 1
Preferred vector width int = 1
Preferred vector width long = 1
Preferred vector width float = 1
Preferred vector width double = 1
OpenCV's HW features list:
ID= 1 (MMX) -> ON
ID= 2 (SSE) -> ON
ID= 3 (SSE2) -> ON
ID= 4 (SSE3) -> ON
ID= 5 (SSSE3) -> ON
ID= 6 (SSE4.1) -> ON
ID= 7 (SSE4.2) -> ON
ID= 8 (POPCNT) -> ON
ID= 9 (FP16) -> ON
ID= 10 (AVX) -> ON
ID= 11 (AVX2) -> ON
ID= 12 (FMA3) -> ON
Total available: 12
Parallel framework: ms-concurrency (nthreads=8)
Here is the output of cv::getBuildInformation():
General configuration for OpenCV 4.4.0 =====================================
Version control: 4.4.0
Platform:
Timestamp: 2020-07-17T22:58:08Z
Host: Windows 10.0.18363 AMD64
CMake: 3.16.4
CMake generator: Visual Studio 15 2017
CMake build tool: C:/Program Files (x86)/Microsoft Visual Studio/2017/Professional/MSBuild/15.0/Bin/MSBuild.exe
MSVC: 1916
CPU/HW features:
Baseline: SSE SSE2 SSE3
requested: SSE3
Dispatched code generation: SSE4_1 SSE4_2 FP16 AVX AVX2 AVX512_SKX
requested: SSE4_1 SSE4_2 AVX FP16 AVX2 AVX512_SKX
SSE4_1 (15 files): + SSSE3 SSE4_1
SSE4_2 (1 files): + SSSE3 SSE4_1 POPCNT SSE4_2
FP16 (0 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 AVX
AVX (4 files): + SSSE3 SSE4_1 POPCNT SSE4_2 AVX
AVX2 (29 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2
AVX512_SKX (4 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2 AVX_512F AVX512_COMMON AVX512_SKX
C/C++:
Built as dynamic libs?: YES
C++ standard: 11
C++ Compiler: C:/Program Files (x86)/Microsoft Visual Studio/2017/Professional/VC/Tools/MSVC/14.16.27023/bin/Hostx86/x64/cl.exe (ver 19.16.27042.0)
C++ flags (Release): /DWIN32 /D_WINDOWS /W4 /GR /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /MP2 /MD /O2 /Ob2 /DNDEBUG
C++ flags (Debug): /DWIN32 /D_WINDOWS /W4 /GR /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /MP2 /MDd /Zi /Ob0 /Od /RTC1
C Compiler: C:/Program Files (x86)/Microsoft Visual Studio/2017/Professional/VC/Tools/MSVC/14.16.27023/bin/Hostx86/x64/cl.exe
C flags (Release): /DWIN32 /D_WINDOWS /W3 /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /MP2 /MD /O2 /Ob2 /DNDEBUG
C flags (Debug): /DWIN32 /D_WINDOWS /W3 /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /MP2 /MDd /Zi /Ob0 /Od /RTC1
Linker flags (Release): /machine:x64 /INCREMENTAL:NO
Linker flags (Debug): /machine:x64 /debug /INCREMENTAL
ccache: NO
Precompiled headers: NO
Extra dependencies:
3rdparty dependencies:
OpenCV modules:
To be built: calib3d core dnn features2d flann gapi highgui imgcodecs imgproc ml objdetect photo stitching video videoio world
Disabled: python2 python3
Disabled by dependency: -
Unavailable: java js ts
Applications: apps
Documentation: NO
Non-free algorithms: NO
Windows RT support: NO
GUI:
Win32 UI: YES
VTK support: NO
Media I/O:
ZLib: build (ver 1.2.11)
JPEG: build-libjpeg-turbo (ver 2.0.5-62)
WEBP: build (ver encoder: 0x020f)
PNG: build (ver 1.6.37)
TIFF: build (ver 42 - 4.0.10)
JPEG 2000: build Jasper (ver 1.900.1)
OpenEXR: build (ver 2.3.0)
HDR: YES
SUNRASTER: YES
PXM: YES
PFM: YES
Video I/O:
DC1394: NO
FFMPEG: YES (prebuilt binaries)
avcodec: YES (58.54.100)
avformat: YES (58.29.100)
avutil: YES (56.31.100)
swscale: YES (5.5.100)
avresample: YES (4.0.0)
GStreamer: NO
DirectShow: YES
Media Foundation: YES
DXVA: YES
Parallel framework: Concurrency
Trace: YES (with Intel ITT)
Other third-party libraries:
Intel IPP: 2020.0.0 Gold [2020.0.0]
at: C:/build/master_winpack-build-win64-vc15/build/3rdparty/ippicv/ippicv_win/icv
Intel IPP IW: sources (2020.0.0)
at: C:/build/master_winpack-build-win64-vc15/build/3rdparty/ippicv/ippicv_win/iw
Eigen: NO
Custom HAL: NO
Protobuf: build (3.5.1)
OpenCL: YES (NVD3D11)
Include path: C:/build/master_winpack-build-win64-vc15/opencv/3rdparty/include/opencl/1.2
Link libraries: Dynamic load
Python (for build): C:/utils/soft/python27-x64/python.exe
Java:
ant: C:/utils/soft/apache-ant-1.9.7/bin/ant.bat (ver 1.9.7)
JNI: C:/Program Files/Java/jdk1.8.0_112/include C:/Program Files/Java/jdk1.8.0_112/include/win32 C:/Program Files/Java/jdk1.8.0_112/include
Java wrappers: NO
Java tests: NO
Install to: C:/build/master_winpack-build-win64-vc15/install
-----------------------------------------------------------------
Upvotes: 1
Views: 3640
Reputation: 72
Try building the libraries yourself with CMake. I am not sure whether the prebuilt binaries have CUDA enabled by default. Building with CMake also lets you tailor the library to your own machine and toolchain.
You can find some guides online:
https://www.youtube.com/watch?v=TT3_dlPL4vo
https://jamesbowley.co.uk/build-opencv-4-0-0-with-cuda-10-0-and-intel-mkl-tbb-in-windows/
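Whichever binaries you end up with, it may help to check whether they were compiled with CUDA before calling anything in cv::cuda. Below is a minimal sketch that scans the text returned by cv::getBuildInformation() for a CUDA entry. The helper name buildInfoReportsCuda is hypothetical, and the "NVIDIA CUDA:" label followed by "YES" is an assumption about how CUDA-enabled builds report themselves. The helper takes a plain string so it can be tried without OpenCV.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: returns true if the build-information text has a line
// whose label is "NVIDIA CUDA:" and whose value starts with "YES".
// Assumption: CUDA-enabled builds print a line such as
//   "  NVIDIA CUDA:                   YES (ver 11.0, CUFFT CUBLAS)"
// while builds without CUDA omit the entry.
bool buildInfoReportsCuda(const std::string& buildInfo) {
    const std::string label = "NVIDIA CUDA:";
    std::istringstream lines(buildInfo);
    std::string line;
    while (std::getline(lines, line)) {
        // Skip leading indentation, then require the label at the line start.
        const std::size_t start = line.find_first_not_of(" \t");
        if (start == std::string::npos) continue;
        if (line.compare(start, label.size(), label) != 0) continue;
        // Skip the padding after the label and look at the value.
        const std::size_t value =
            line.find_first_not_of(" \t", start + label.size());
        if (value != std::string::npos && line.compare(value, 3, "YES") == 0) {
            return true;
        }
    }
    return false;
}
```

In a real program you would pass it cv::getBuildInformation(). The build information posted in the question has no such line, so the helper would return false for it. The bare "NVIDIA CUDA" entry under "OpenCL Platforms" in the opencv_version output is not matched, because it describes the OpenCL driver and not how OpenCV was compiled.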
Upvotes: 2