I am attempting dynamic parallelism on a GTX 980 Ti card. Every attempt at running the code returns "unknown error". A simple example is shown below, along with the compilation options.
I can execute kernels at depth=0 with no issues. The error appears the first time a child kernel is launched.
The cudaDeviceSynchronize() calls were added after looking at other questions here, but they didn't solve the problem.
Any ideas? Could this be a driver issue?
Edit 1:
OS: Linux-x86_64
Nvidia driver version: 384.59
nvcc version 7.5.17
There are two 980 Ti cards connected via PCIe x16 Gen3. The system also has Windows installed on a separate RAID-configured SSD.
    #include <cuda.h>
    #include <fstream>
    #include <stdio.h>
    #include <stdlib.h>

    // Child kernel: does trivial work only.
    __global__ void ker_two(){
        int two=0;
        two++;
    }

    // Parent kernel: launches the child, then waits for it on the device.
    __global__ void ker_one(){
        int one=0;
        one++;
        ker_two<<<1,1>>>();
        cudaDeviceSynchronize();
    }

    int main( ){
        ker_one<<<1,1>>>();
        cudaDeviceSynchronize();
        cudaError_t err = cudaGetLastError();
        if (err != cudaSuccess)
            printf("Cuda Error: %s\n", cudaGetErrorString(err));
        return 0;
    }
Compiled with:
nvcc -arch=compute_52 -rdc=true -lcudadevrt test.cu
Upvotes: 3
Views: 192
Reputation: 131546
I am able (?) to reproduce the error on a machine with a Maxwell Titan card. It's a Fedora 24 distribution with CUDA 8.0.61 installed manually. Driver version is 375.51.
However, it seems the problem only occurs on my system when I call cudaDeviceSynchronize() within ker_one(), regardless of whether I call the second kernel or not. So maybe that's the problem you're seeing rather than dynamic parallelism per se.
Considering @talonmies' comment, this might even be just a driver issue.
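To narrow down which of the two it is, something like the following sketch might help. It toggles the child launch and the in-kernel cudaDeviceSynchronize() independently and reports the host-side status for each combination. The names child, parent and run_case are made up for this illustration, and it assumes a toolkit of the era discussed here (7.5 / 8.0), where calling cudaDeviceSynchronize() from device code is still supported; later toolkits deprecated and then removed it. The cudaDeviceReset() between cases is there on the assumption that a failed case may leave the context unusable.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Empty child kernel; it exists only so the parent has something to launch.
__global__ void child() {}

// Parent kernel with two independent switches, so the child launch and the
// device-side synchronize can be exercised separately.
__global__ void parent(bool launch_child, bool sync_in_kernel) {
    if (launch_child) {
        child<<<1, 1>>>();
        cudaError_t e = cudaGetLastError();        // device-side launch status
        if (e != cudaSuccess)
            printf("device: child launch failed, code %d\n", (int)e);
    }
    if (sync_in_kernel) {
        cudaError_t e = cudaDeviceSynchronize();   // device-side sync (older toolkits only)
        if (e != cudaSuccess)
            printf("device: in-kernel sync failed, code %d\n", (int)e);
    }
}

// Hypothetical helper: runs one combination and returns the first host-side error.
cudaError_t run_case(bool launch_child, bool sync_in_kernel) {
    parent<<<1, 1>>>(launch_child, sync_in_kernel);
    cudaError_t err = cudaGetLastError();          // errors detected at launch time
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();             // errors detected during execution
    printf("child=%d sync=%d -> %s\n",
           (int)launch_child, (int)sync_in_kernel, cudaGetErrorString(err));
    cudaDeviceReset();                             // start the next case from a fresh context
    return err;
}

int main() {
    // Try all four combinations of the two switches.
    for (int c = 0; c <= 1; ++c)
        for (int s = 0; s <= 1; ++s)
            run_case(c != 0, s != 0);
    return 0;
}
```

It can be built with the same flags as in the question (-rdc=true and -lcudadevrt). If only the sync=1 rows fail, that would match what I see here; if the child=1 rows fail as well, the launch itself is implicated.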
Upvotes: 1