Reputation: 41
To use the unified memory feature in CUDA 6, the following requirements must be met:
My setup is:
The sample code is taken from the programming guide, page 210.
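#include <stdio.h>

// Managed array, accessible from both host and device code.
__device__ __managed__ int ret[1000];

__global__ void AplusB(int a, int b) {
    ret[threadIdx.x] = a + b + threadIdx.x;
}

int main() {
    AplusB<<< 1, 1000 >>>(10, 100);
    // Wait for the kernel to finish before the host reads ret.
    cudaDeviceSynchronize();
    for (int i = 0; i < 1000; i++)
        printf("%d: A+B = %d\n", i, ret[i]);
    return 0;
}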
The nvcc compile options I used are:
nvcc -m64 -Xptxas=-Werror -arch=compute_30 -code=sm_30 -o UM UnifiedMem.cu
This code compiles without errors. During execution, it produces a segmentation fault at printf(). It looks as if the unified memory feature did not take effect: the address of the variable ret is still a GPU address, but printf is called on the CPU. The CPU is trying to access data that is not allocated in host memory, so it segfaults. Can anybody help me? What is wrong here?
Upvotes: 3
Views: 1433
Reputation: 37904
Though I am not certain (and I can't check it for myself right now), I think the cause is that Ubuntu 13.10 ships gcc 4.8.1, which I believe is not supported yet, even in the newest CUDA Toolkit 6.0. Try to compile your code with gcc 4.7.3 as the host compiler (that is, the same version that is the default in the officially supported Ubuntu 13.04). For that, you might install the gcc-4.7 package and point nvcc at /usr/bin/gcc-4.7 as its host compiler. For C++ support, I believe you need g++-4.7 as well.
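Before switching compilers, it may help to see whether the CUDA runtime itself reports a problem. The following is only a diagnostic sketch, not a fix. It assumes device 0 is the GPU in use, and the CHECK macro is a hypothetical helper that is not part of the programming guide sample. It asks the runtime whether the device supports managed memory and checks the status of the kernel launch and of the synchronization before the host touches ret.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not from the programming guide): print the failing
// call and its error string, then return from main with a non-zero status.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(err_));                        \
            return 1;                                                 \
        }                                                             \
    } while (0)

__device__ __managed__ int ret[1000];

__global__ void AplusB(int a, int b) {
    ret[threadIdx.x] = a + b + threadIdx.x;
}

int main() {
    int device = 0;   // assumption: the current device is the one of interest
    int managed = 0;

    // Ask the runtime whether this device supports managed memory at all.
    CHECK(cudaGetDevice(&device));
    CHECK(cudaDeviceGetAttribute(&managed, cudaDevAttrManagedMemory, device));
    printf("managed memory supported: %d\n", managed);
    if (!managed) return 1;

    AplusB<<<1, 1000>>>(10, 100);
    CHECK(cudaGetLastError());        // reports launch-time errors
    CHECK(cudaDeviceSynchronize());   // reports errors raised during execution

    // Only read ret on the host once every call above has succeeded.
    printf("ret[0] = %d, ret[999] = %d\n", ret[0], ret[999]);
    return 0;
}
```

If you do install the older compiler, nvcc's -ccbin option selects the host compiler, for example: nvcc -ccbin g++-4.7 -arch=sm_30 -o check check.cu (the file name check.cu is just a placeholder). If one of the checked calls fails, its error string should narrow down whether the problem is in the setup or in the toolchain.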
If you need a simple step-by-step guide, you might follow http://n00bsys0p.co.uk/blog/2014/01/23/nvidia-cuda-55ubuntu-1310-saucy-salamander. It's for CUDA Toolkit 5.5, but I think it should be relevant for the recent version as well.
Upvotes: 1