Face_recognition python: Using dlib gives code: 98, reason: invalid device function

Question

I am trying to run face_recognition on a GeForce GT 730 GPU (compute capability 3.5) with cudatoolkit 11.3, cuDNN 8.2.1, and driver 475.14 on Windows 10. I am getting the error message

Error while calling cudaOccupancyMaxPotentialBlockSize(&num_blocks,&num_threads,K) in file c:\users\zachb\anaconda3\envs\dlib5\dlib\dlib\cuda\cuda_utils.h:186. code: 98, reason: invalid device function

when executing: face_encodings = face_recognition.face_encodings(images, face_locations)

I initially installed the CUDA toolkit and cuDNN from the NVIDIA archives and copied the cuDNN folders into the toolkit directory. However, doing so always led to the error message

ImportError: DLL load failed: The specified module could not be found.

when importing dlib. That problem went away when I installed cudatoolkit and cudnn from conda instead. This led me to uninstall CUDA from the system and install it into my Python environment using the nvidia conda channel.

I am trying to mimic this gist: https://gist.github.com/nguyenhoan1988/ed92d58054b985a1b45a521fcf8fa781

This is the full set of commands I currently run to set up my environment:

cd C:\Users\zachb\anaconda3\envs

conda create --name dlib python=3.8 cmake ipython

cd C:\Users\zachb\anaconda3\envs\dlib

conda install conda-forge::cudnn=8.2.1 conda-forge::cudatoolkit=11.3 cuda -c nvidia/label/cuda-11.3.0 -c nvidia/label/cuda-11.3.1

git clone https://github.com/davisking/dlib.git

cd dlib

mkdir build

cd build

cmake .. -G "Visual Studio 15 2017" -A x64 -DDLIB_USE_CUDA=1 -DDLIB_USE_CUDA_COMPUTE_CAPABILITIES="35" -DUSE_AVX_INSTRUCTIONS=1 -DCUDAToolkit_ROOT="C:\Users\zachb\anaconda3\envs\dlib\bin"

cmake --build .

cd ..

python setup.py install --set DLIB_USE_CUDA=1

conda install numpy matplotlib Pillow

conda install click=8.1.7 colorama=0.4.6 face-recognition=1.3.0 face_recognition_models=0.3.0 --no-dep

When I print(dlib.DLIB_USE_CUDA) it returns True. I have also tried installing CUDA into my environment with conda and copying cuDNN into my environment directory (this gave the same import error for dlib). I have also tried CUDA 11.2, 11.3, 11.4, 11.5, and 11.8 (with the appropriate cuDNN versions) and keep getting this error.
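A small diagnostic along these lines can collect the build flags in one place. It is only a sketch: `describe_dlib_build` is a hypothetical helper, and it assumes the installed dlib exposes `DLIB_USE_CUDA`, `USE_AVX_INSTRUCTIONS`, and `dlib.cuda.get_num_devices()`, reading each one defensively in case it is missing. Note that `DLIB_USE_CUDA` being True only says dlib was compiled with CUDA support; it does not say the compiled kernels match the GPU's compute capability.

```python
def describe_dlib_build(dlib_module):
    """Summarize the CUDA-related attributes of a dlib-like module.

    Hypothetical helper: every attribute is read with getattr so a build
    that lacks one of them is reported instead of raising an error.
    """
    cuda = getattr(dlib_module, "cuda", None)
    get_num_devices = getattr(cuda, "get_num_devices", None)
    return {
        "version": getattr(dlib_module, "__version__", "unknown"),
        "use_cuda": bool(getattr(dlib_module, "DLIB_USE_CUDA", False)),
        "use_avx": bool(getattr(dlib_module, "USE_AVX_INSTRUCTIONS", False)),
        # None means the build does not expose a device-count function.
        "num_devices": get_num_devices() if callable(get_num_devices) else None,
    }


if __name__ == "__main__":
    try:
        import dlib
    except ImportError as exc:
        # On Windows, "DLL load failed" surfaces as an ImportError.
        print("dlib could not be imported:", exc)
    else:
        for key, value in describe_dlib_build(dlib).items():
            print(f"{key}: {value}")
```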

I referenced these to help determine the requirements:

https://docs.nvidia.com/cuda/archive/11.8.0/cuda-toolkit-release-notes/index.html
https://quasar.ugent.be/files/doc/cuda-msvc-compatibility.html
https://en.wikipedia.org/wiki/CUDA#GPUs_supported
https://docs.nvidia.com/deeplearning/cudnn/archives/cudnn-896/support-matrix/index.html
https://docs.nvidia.com/deploy/cuda-compatibility/

EDIT: I got it to work. I system-installed CUDA 10.2 and cuDNN 7.6.5, and I edited the setup.py file after running the cmake commands. The change I made to the file was adding -DDLIB_USE_CUDA=1, -DDLIB_USE_CUDA_COMPUTE_CAPABILITIES=35, and -DUSE_AVX_INSTRUCTIONS=1 to cmake_args:

cmake_args = ['-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=' + extdir,
              '-DPYTHON_EXECUTABLE=' + sys.executable,
              '-DDLIB_USE_FFMPEG=OFF',
              '-DDLIB_USE_CUDA=1',                        # added
              '-DDLIB_USE_CUDA_COMPUTE_CAPABILITIES=35',  # added: GT 730 is compute capability 3.5
              '-DUSE_AVX_INSTRUCTIONS=1']                 # added
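A likely reason this edit mattered is that setup.py runs its own CMake configure step, so flags given to the separate cmake call in the build directory do not carry over to the Python module. As an alternative to editing setup.py, the same three options might be passed on the command line. The sketch below only builds that command; `cmake_capability` and `setup_py_command` are hypothetical helpers, and it assumes setup.py forwards every `--set NAME=VALUE` to CMake as `-DNAME=VALUE`, as the `--set DLIB_USE_CUDA=1` used in the question suggests.

```python
import sys


def cmake_capability(compute_capability):
    """Convert a dotted compute capability such as '3.5' to the '35' form."""
    major, minor = compute_capability.split(".")
    return f"{int(major)}{int(minor)}"


def setup_py_command(compute_capability, use_avx=True):
    """Build the argument list for a CUDA-enabled dlib install (hypothetical helper)."""
    options = {
        "DLIB_USE_CUDA": "1",
        "DLIB_USE_CUDA_COMPUTE_CAPABILITIES": cmake_capability(compute_capability),
    }
    if use_avx:
        options["USE_AVX_INSTRUCTIONS"] = "1"
    command = [sys.executable, "setup.py", "install"]
    for name, value in options.items():
        # Assumption: each --set pair becomes a -DNAME=VALUE CMake argument.
        command += ["--set", f"{name}={value}"]
    return command


if __name__ == "__main__":
    # Print the command so it can be reviewed before running it in the dlib checkout.
    print(" ".join(setup_py_command("3.5")))
```

The returned list can be handed to subprocess.run(..., check=True) from inside the dlib source directory once the printed command looks right.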

However, I have come across a new error where running

face_recognition.face_locations(img, model="cnn")

now produces the error

RuntimeError: Error while calling cudnnConvolutionBiasActivationForward( context(), &alpha1, descriptor(data), data.device(), (const cudnnFilterDescriptor_t)filter_handle, filters.device(), (const cudnnConvolutionDescriptor_t)conv_handle, (cudnnConvolutionFwdAlgo_t)forward_algo, forward_workspace, forward_workspace_size_in_bytes, &alpha2, out_desc, out, descriptor(biases), biases.device(), use_relu ? relu_activation_descriptor() : identity_activation_descriptor(), out_desc, out) in file C:\Users\zachb\anaconda3\envs\dlib\dlib\dlib\cuda\cudnn_dlibapi.cpp:1253. code: 9, reason: CUDNN_STATUS_NOT_SUPPORTED

I think I found an explanation at https://forums.developer.nvidia.com/t/weird-error-runtimeerror-error-while-calling-cudnnconvolutionforward-dlib-cuda-cudnn-dlibapi-cpp-1007-code-7-reason-a-call-to-cudnn-failed/108595/3, which says that the error comes from the cuDNN version I am using.

Also, it is important to note that with the changes I made I can now run all the dlib commands I need to get encodings and locations.
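Until the cuDNN issue is sorted out, one possible workaround is to try the CNN detector first and fall back to the HOG detector when it raises. This is a sketch, not a fix for the cuDNN error itself: `locate_faces` is a hypothetical wrapper, and it assumes the detector is called the way `face_recognition.face_locations(img, model=...)` is called above and that failures surface as RuntimeError, as in the traceback.

```python
def locate_faces(image, detector, models=("cnn", "hog")):
    """Try each detection model in order and return (model_name, locations).

    `detector` is any callable with the signature detector(image, model=...),
    for example face_recognition.face_locations.
    """
    last_error = None
    for model in models:
        try:
            return model, detector(image, model=model)
        except RuntimeError as exc:
            # dlib reports CUDA/cuDNN failures as RuntimeError in Python.
            last_error = exc
    raise RuntimeError(f"all detection models failed: {models}") from last_error
```

With face_recognition installed, it could be used as model, boxes = locate_faces(img, face_recognition.face_locations); the returned model name shows which detector actually produced the result.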

