CUDA racecheck error when using double in a kernel

Question

I'm new to CUDA programming. I recently ran into a racecheck error, and the heavily simplified code below reproduces it (tested on Linux and Windows):

//test.cu:
__global__ void kernel(){

    __shared__ double s1;
    __shared__ double s2;
    __shared__ double s3;
    __shared__ double a1;
    __shared__ double a2;
    __shared__ double a3;

    s1=1.0E-3;
    s2=1.0E-3;
    s3=1.0E-3;
    a1=1.0E-3; // line 14
    a2=1.0E-3; // line 15
    a3=1.0E-3; // line 16

    __syncthreads();
}

int main(){

    dim3 blockdim(32,32);
    kernel<<<1,blockdim>>>();
    cudaDeviceReset();
    return 0;
}

I compile the code with: nvcc -G -g test.cu -o test

Then I run it under racecheck with: cuda-memcheck --tool racecheck test

and I get errors like:

========= ERROR: Race reported between Write access at 0x00000350 in test.cu:14:kernel(void)
=========     and Write access at 0x00000350 in test.cu:14:kernel(void) [1098 hazards]

When I run it repeatedly, the error is reported at line 14, 15, or 16, varying at random from run to run. The error goes away if I do any one of the following:

remove "cudaDeviceReset()"
reduce the number of threads per block, e.g. to (16, 32)
change the type of all variables from "double" to "float"
remove the variables "s1", "s2" and "s3" from the code

This really puzzles me, and it leads me to a naïve question: does every thread execute every statement in a kernel function? Thanks!
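To make the last question concrete, here is a minimal sketch, not taken from the original code. It counts how many threads execute an unguarded statement, and it shows one common pattern in which a single thread writes a shared variable before a barrier. The names count_executions, run_count_executions, counter and out are hypothetical, and error checking is omitted for brevity. Whether the guarded write removes the specific racecheck report described above has not been tested here.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: every thread of the block runs every unguarded
// statement, so the atomicAdd below happens once per thread.
__global__ void count_executions(unsigned int *counter, double *out) {
    __shared__ double s1;  // one copy per block, shared by all its threads

    atomicAdd(counter, 1u);  // executed by each thread in the block

    // Only one thread writes the shared variable, so no two threads
    // store to the same shared address.
    if (threadIdx.x == 0 && threadIdx.y == 0) {
        s1 = 1.0E-3;
    }
    __syncthreads();  // the write is visible to the whole block after this

    // One thread copies the value out so that s1 is actually used.
    if (threadIdx.x == 0 && threadIdx.y == 0) {
        *out = s1;
    }
}

// Hypothetical host helper: launches one 32x32 block and returns how many
// times the unguarded statement was executed.
unsigned int run_count_executions() {
    unsigned int *d_counter = nullptr;
    double *d_out = nullptr;
    unsigned int h_counter = 0;

    cudaMalloc(&d_counter, sizeof(unsigned int));
    cudaMalloc(&d_out, sizeof(double));
    cudaMemset(d_counter, 0, sizeof(unsigned int));

    dim3 blockdim(32, 32);  // same block shape as in the question
    count_executions<<<1, blockdim>>>(d_counter, d_out);

    // cudaMemcpy waits for the kernel to finish before copying.
    cudaMemcpy(&h_counter, d_counter, sizeof(unsigned int),
               cudaMemcpyDeviceToHost);

    cudaFree(d_counter);
    cudaFree(d_out);
    return h_counter;
}

int main() {
    printf("unguarded statement executed %u times\n", run_count_executions());
    return 0;
}
```

If the launch succeeds, the count should equal the number of threads in the block (32 x 32), which suggests that in the original kernel every thread stores to s1, s2, s3, a1, a2 and a3. All of those stores write the same value, but they still target the same shared addresses from many threads.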

Answers (0)