Reputation: 1314
I'm declaring a global variable myvar
on the device using the __device__
specifier. I never set it to a meaningful value (I'm not using cudaMemcpyToSymbol before the kernel launch, as you normally would).
I'd expect the value of myvar
to be random garbage, but it's neatly 0.0 every time. Does CUDA auto-initialise device variables?
I've also checked it with the CUDA debugger, and the value is indeed 0.
#include <cstdio>

__device__ float myvar;

__global__ void kernel(){
    printf("my var: %f\n", myvar);
}

int kernel_launch(){
    kernel<<<1,5>>>();
    cudaDeviceSynchronize();
    return 0;
}
Upvotes: 0
Views: 2458
Reputation: 5188
Uninitialized __device__
variables, much like their uninitialized global __host__
counterparts, still need to be laid out in the executable: the toolchain records their size and location in memory. As far as I know, such definitions are always emitted with a placeholder value, which unsurprisingly appears to be zero.
This can be checked readily. For example this command disassembles the output of a simple __device__ int a;
declaration:
nvcc -o test.o -c -x cu - <<< "__device__ int a;" && cuobjdump -xelf all test.o && nvdisasm *cubin
You'll get the following output:
.headerflags @"EF_CUDA_TEXMODE_UNIFIED EF_CUDA_64BIT_ADDRESS EF_CUDA_SM20 EF_CUDA_PTX_SM(EF_CUDA_SM20)"
//--------------------- .nv.constant14 --------------------------
.section .nv.constant14,"a",@progbits
.align 4
.align 8
.nv.constant14:
/*0000*/ .dword a
//--------------------- .nv.global --------------------------
.section .nv.global,"aw",@nobits
.align 4
.type a,@object
.size a,(.L_1 - a)
a:
.nv.global:
.zero 4
.L_1:
where you can clearly see the implicit zero initialization.
However, I believe it would be unsafe to rely on this.
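If a defined starting value matters, the safe options are a static initializer or an explicit copy from the host before the launch, rather than counting on the implicit zero. A minimal sketch (the variable and kernel names are mine, for illustration):

```cuda
#include <cstdio>

// Option 1: static initializer, embedded in the module image.
__device__ float scale = 2.5f;

__global__ void show() {
    printf("scale: %f\n", scale);
}

int main() {
    // Option 2: set the symbol explicitly from the host before the launch.
    float v = 3.5f;
    cudaMemcpyToSymbol(scale, &v, sizeof(v));
    show<<<1, 1>>>();
    cudaDeviceSynchronize();
    return 0;
}
```

Either way, the value of the variable is then specified by you rather than by a toolchain placeholder.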
Upvotes: 0
Reputation: 15734
CUDA does not automatically initialize any variables. It's just an implementation-dependent coincidence that myvar
comes out as zero in your test app.
In IEEE-754 floating point (used by NVIDIA GPUs), an all-zero bit pattern corresponds to 0.0, so it's a much more likely "random" value than, say, 1.0f.
Don't draw conclusions about the contents of all your GPU memory from the value of that single word...
I did a small experiment and was slightly surprised by the result, though. I initialized myvar
with __device__ float myvar(1.1f);
and altered the printf()
so that it prints both the value and the address of the variable. Then I ran it, got 1.1f
as output, and noted the address. Then I removed the initializer and ran it again. This time the value went back to 0.0f
while the address stayed the same, showing that the chunk of memory in which this variable lives does get zeroed out as part of regular CUDA operations. For instance, this could happen if the CUDA program is copied to the GPU in a fixed-size chunk in which the remaining data is zero, and myvar
is assigned an address within this chunk.
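For reference, the experiment described above can be sketched like this (my reconstruction, not the poster's exact code):

```cuda
#include <cstdio>

// Run 1: with the initializer. Run 2: remove it, i.e. plain `__device__ float myvar;`.
__device__ float myvar(1.1f);

__global__ void kernel() {
    // Print both the value and the address of the variable.
    printf("my var: %f at %p\n", myvar, (void *)&myvar);
}

int main() {
    kernel<<<1, 1>>>();
    cudaDeviceSynchronize();
    return 0;
}
```

On the poster's setup, the address stayed the same across both runs while the value dropped from 1.1f back to 0.0f.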
Upvotes: 2