F.P An

Reputation: 9

CUDA racecheck error when using double in a kernel

I'm new to CUDA programming. I recently ran into a racecheck error, and here is a heavily simplified program that reproduces it (tested on Linux and Windows):

//test.cu:
__global__ void kernel(){

__shared__ double s1;
__shared__ double s2;
__shared__ double s3;
__shared__ double a1;
__shared__ double a2;
__shared__ double a3;

s1=1.0E-3;
s2=1.0E-3;
s3=1.0E-3;
a1=1.0E-3; // line 14
a2=1.0E-3; // line 15
a3=1.0E-3; // line 16

__syncthreads();
}

int main(){

dim3 blockdim(32,32);
kernel<<<1,blockdim>>>();
cudaDeviceReset();
return 0;
}

Then compile the code with: nvcc -G -g test.cu -o test

Then run it under racecheck with: cuda-memcheck --tool racecheck test

and I got errors like:

========= ERROR: Race reported between Write access at 0x00000350 in test.cu:14:kernel(void)
=========     and Write access at 0x00000350 in test.cu:14:kernel(void) [1098 hazards]

When I run it repeatedly, the error is reported seemingly at random for line 14, 15 or 16. The error goes away if I make any one of the following changes:

  1. Remove the call to "cudaDeviceReset()".
  2. Reduce the number of threads in a block, e.g. to (16, 32).
  3. Change the type of all variables from "double" to "float".
  4. Remove the variables "s1", "s2" and "s3" from the code.

This really puzzles me, and it leads me to a naïve question: does every thread execute every statement in a kernel function? Thanks!
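
For reference, here is a minimal sketch of a guarded variant. It assumes the goal is simply to initialise the shared variables once per block. Every thread in a block does run the kernel body, so in test.cu all 32x32 threads write the same shared locations, which is the write/write pattern the racecheck report describes. The names kernel_guarded and run_guarded_init and the file name guarded_init.cu are made up for illustration, and the sketch does not try to explain why the report depends on the variable type or the block size.

```cuda
// guarded_init.cu: hypothetical variant of test.cu, for illustration only
#include <cstdio>
#include <cuda_runtime.h>

__global__ void kernel_guarded(double *out)
{
    __shared__ double s1;
    __shared__ double a1;

    // Every thread of the block executes this function, so an unguarded
    // "s1 = ..." would be executed once per thread (1024 times here).
    // Restrict the writes to a single thread instead.
    if (threadIdx.x == 0 && threadIdx.y == 0) {
        s1 = 1.0E-3;
        a1 = 1.0E-3;
    }

    // Barrier: the values written by thread (0,0) become visible to the block.
    __syncthreads();

    // Each thread reads the shared values and writes only its own slot.
    int tid = threadIdx.y * blockDim.x + threadIdx.x;
    out[tid] = s1 + a1;
}

// Launches one 32x32 block; returns the number of threads that read an
// unexpected sum, or -1 if a CUDA call failed.
int run_guarded_init()
{
    const int n = 32 * 32;
    static double h_out[n];
    double *d_out = nullptr;

    if (cudaMalloc(&d_out, n * sizeof(double)) != cudaSuccess) return -1;

    dim3 blockdim(32, 32);
    kernel_guarded<<<1, blockdim>>>(d_out);

    // cudaMemcpy waits for the kernel to finish before copying.
    cudaError_t err = cudaMemcpy(h_out, d_out, n * sizeof(double),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) return -1;

    int bad = 0;
    for (int i = 0; i < n; ++i)
        if (h_out[i] != 1.0E-3 + 1.0E-3) ++bad;
    return bad;
}

int main()
{
    printf("mismatching threads: %d\n", run_guarded_init());
    cudaDeviceReset();
    return 0;
}
```

It can be built and checked the same way as above (nvcc -G -g guarded_init.cu -o guarded_init, then cuda-memcheck --tool racecheck guarded_init). The guard plus __syncthreads() is a common way to have one thread initialise shared memory before the rest of the block reads it.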

Answers (0)
