Andy

Reputation: 61

LibTorch (C++) with CUDA is raising an exception

I am trying to create a NN with LibTorch 1.3 and C++ using CUDA 10.1 on Windows 10. For the build I am using Visual Studio 2019.

So far I have tried basic examples and the MNIST example on the CPU, which work. However, I cannot run them with CUDA. I tried to move the model to the GPU as described here, but it is not working.

To move your model to GPU memory, you can write model.to(at::kCUDA);. Make sure the inputs to a model are also living in CUDA memory by calling tensor.to(at::kCUDA), which will return a new tensor in CUDA memory.

So I tried just this simple program:

#include <torch/torch.h>

int main() {
    auto net = std::make_shared<Net>(); // Net defined as in the MNIST example
    net->to(torch::kCUDA);              // crashes here
}

Then I tried to move simple tensors to GPU memory, but that crashes as well.

#include <torch/torch.h>

int main() 
{
    torch::Tensor a = torch::ones({ 2, 2 }, torch::requires_grad());
    torch::Tensor b = torch::randn({ 2, 2 });
    a.to(torch::kCUDA);    //Here it crashes
    b.to(torch::kCUDA);    //
    auto c = a + b;
}

and I got:

Exception thrown at 0x00007FFB8263A839 in Resnet50.exe: Microsoft C++ exception: c10::Error at memory location 0x000000E574979F30.
Unhandled exception at 0x00007FFB8263A839 in Resnet50.exe: Microsoft C++ exception: c10::Error at memory location 0x000000E574979F30.

In debug mode, the debugger points into KernelBase.dll, at

auto operator()(Parameters... args) -> decltype(std::declval<FuncType>()(std::forward<Parameters>(args)...)) {
  return kernel_func_(std::forward<Parameters>(args)...);
}

Calling torch::cuda::is_available() shows that a CUDA device is found.

I don't have much experience with exceptions.

Upvotes: 3

Views: 2618

Answers (1)

Eric Tondelli

Reputation: 113

Hi, I had the same problem. I solved it by installing the LibTorch build for CUDA 9.2. I downloaded the release version from https://pytorch.org/ along with CUDA Toolkit 9.2 and the matching cuDNN.

I'm using Visual Studio 2017.

If you have another CUDA version installed, I suggest uninstalling it from the Control Panel.

CUDA Toolkit: https://developer.nvidia.com/cuda-92-download-archive?target_os=Windows&target_arch=x86_64&target_version=10

To install cuDNN on Windows, see https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html#install-windows. I have version cudnn-9.2-windows10-x64-v7.6.5.32.

After that I compiled the project with these commands:

cmake -DCMAKE_PREFIX_PATH=path\to\libtorch -Ax64 ..
cmake --build . --config Release
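For reference, a minimal CMakeLists.txt for such a project might look like this (the target name Resnet50 and the source file name main.cpp are assumptions on my part):

```cmake
cmake_minimum_required(VERSION 3.12)
project(Resnet50)

# CMAKE_PREFIX_PATH must point at the extracted libtorch directory
find_package(Torch REQUIRED)

add_executable(Resnet50 main.cpp)
target_link_libraries(Resnet50 "${TORCH_LIBRARIES}")
set_property(TARGET Resnet50 PROPERTY CXX_STANDARD 14)
```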

And in my code I am able to do

testModel->to(torch::DeviceType::CUDA);

Be sure to compile in Release mode.

Upvotes: 2
