Reputation: 13
Does CUDA somehow block and transfer all allocated managed memory to the GPU when a kernel is launched? I was experimenting with unified memory and got results that seem strange, at least from my point of view.
I create two arrays and pass A to the kernel; B is never touched by the kernel call, yet it cannot be accessed from the host. The program crashes as soon as I touch B. The output is:
0 0 0 here1
If I comment out the b[0] = 1; line, the code runs fine and prints:
0 0 0 here1 after1 0 here2 1 after2
Why is this happening?
#include <iostream>

__global__ void kernel(int* t)
{
    t[0]++;
}

int main()
{
    int* a;
    int* b;
    std::cout << cudaMallocManaged(&a, sizeof(int) * 100) << std::endl;
    std::cout << cudaMallocManaged(&b, sizeof(int) * 100) << std::endl;
    std::cout << b[0] << std::endl;
    kernel<<<1,1,0,0>>>(a);
    std::cout << "here1" << std::endl;
    b[0] = 1;                      // crashes here
    std::cout << "after1" << std::endl;
    cudaDeviceSynchronize();
    std::cout << b[0] << std::endl;
    std::cout << "here2" << std::endl;
    std::cout << a[0] << std::endl;
    std::cout << "after2" << std::endl;
    return 0;
}
Upvotes: 1
Views: 558
Reputation: 7255
Does cuda somehow block and transfer all allocated managed memory to the GPU when a kernel is launched?
Yes, provided your device is of compute capability less than 6.0.
On such devices, managed memory works by copying all managed allocations to the GPU before a kernel launch and copying them all back to the host on synchronization. In between, any access to managed memory from the host leads to a segmentation fault.
You can be more specific about which allocations should be migrated for a given kernel by attaching them to a stream using cudaStreamAttachMemAsync() and launching the kernel into that stream.
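As a sketch of that approach, applied to the code from the question: attach a to a dedicated stream with cudaMemAttachSingle, and attach b with cudaMemAttachHost so that kernel launches do not migrate it. This assumes a single-GPU, pre-6.0 (pre-Pascal) setup; error checking is omitted for brevity.

```cuda
#include <iostream>
#include <cuda_runtime.h>

__global__ void kernel(int* t)
{
    t[0]++;
}

int main()
{
    int* a;
    int* b;
    cudaMallocManaged(&a, sizeof(int) * 100);
    cudaMallocManaged(&b, sizeof(int) * 100);
    a[0] = 0;   // host access is fine before any kernel has been launched
    b[0] = 0;

    // Attach a to its own stream; attach b to the host so that
    // kernel launches no longer migrate it to the device.
    cudaStream_t stream;
    cudaStreamCreate(&stream);
    cudaStreamAttachMemAsync(stream, a, 0, cudaMemAttachSingle);
    cudaStreamAttachMemAsync(stream, b, 0, cudaMemAttachHost);
    cudaStreamSynchronize(stream);   // let the attachments take effect

    kernel<<<1, 1, 0, stream>>>(a);

    b[0] = 1;   // no crash: b is not associated with the running kernel

    cudaStreamSynchronize(stream);   // a is accessible again after this
    std::cout << a[0] << " " << b[0] << std::endl;

    cudaStreamDestroy(stream);
    cudaFree(a);
    cudaFree(b);
    return 0;
}
```

Note that a is still inaccessible from the host between the launch and the cudaStreamSynchronize() call; the attachment only narrows the migration to the memory the kernel actually uses.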
Upvotes: 2