Reputation: 51
I'm new to CUDA and OpenCL.
I have translated a program's kernels from CUDA to OpenCL, using the same seeds for random number generation in both versions.
While the OpenCL version produces exactly the same results on every run, the CUDA version gives slightly different results each run.
I'm compiling the CUDA version without -use_fast_math.
My device is compute capability 1.1.
Any idea about what could be the reason?
Thanks in advance
Upvotes: 1
Views: 1054
Reputation: 51
I found the problem. In the original code, some values were being updated asynchronously and had not been completely updated yet when they were read. Thanks everybody for the help, and sorry for the trouble.
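For reference, here is a minimal sketch of the kind of mistake I mean (hypothetical code, not my actual kernels): the host reads results before the kernel and the asynchronous copy that updates them have finished, so each run can see a different mix of old and new values.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical example: every element should be doubled before the
// host looks at it.
__global__ void scale(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main()
{
    const int n = 1 << 20;
    float *h = 0, *d = 0;
    cudaMallocHost((void **)&h, n * sizeof(float));   // pinned host buffer
    cudaMalloc((void **)&d, n * sizeof(float));
    for (int i = 0; i < n; ++i) h[i] = 1.0f;
    cudaMemcpy(d, h, n * sizeof(float), cudaMemcpyHostToDevice);

    scale<<<(n + 255) / 256, 256>>>(d, n);
    cudaMemcpyAsync(h, d, n * sizeof(float), cudaMemcpyDeviceToHost);

    // Without this synchronization the read below races with the
    // asynchronous copy and can see partially updated data, giving
    // slightly different output from run to run.
    cudaDeviceSynchronize();

    printf("h[0] = %f\n", h[0]);
    cudaFreeHost(h);
    cudaFree(d);
    return 0;
}
```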
Upvotes: 1
Reputation: 151849
Devices of compute capability 1.1 do not support double operations. So if you are using double, it gets demoted to float. That could possibly affect your results, although a compute capability 1.1 device cannot support double in OpenCL either, AFAIK.
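As an illustration (a hypothetical kernel, not taken from your code): when something like this is built for a compute capability 1.1 target, nvcc demotes the double arithmetic to float and prints a demotion warning, so the numbers will not match a true double-precision build.

```cuda
// Hypothetical kernel: compiled with e.g. "nvcc -arch=sm_11", the
// double math below is demoted to float (nvcc warns about it), so
// the accumulated sum loses precision compared to an sm_13+ build.
__global__ void sum_demo(const double *in, double *out, int n)
{
    double acc = 0.0;          // effectively a float on sm_11
    for (int i = 0; i < n; ++i)
        acc += in[i];
    *out = acc;
}
```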
"My question actually is: are there any CUDA compile options that may affect the accuracy of the CUDA results?"
Yes, there are a variety of nvcc options that affect CUDA's floating-point behavior, for example -use_fast_math, -fmad, -prec-div, -prec-sqrt, and -ftz.
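To illustrate with a made-up kernel (not from your code): whether nvcc contracts a multiply and an add into one fused multiply-add, controlled by -fmad and implicitly by -use_fast_math, changes how the result is rounded.

```cuda
// Hypothetical kernel: with -fmad=true (the default) the line below
// may compile to a single fused multiply-add with one rounding step;
// with -fmad=false the multiply and add are rounded separately, so
// the two builds can differ in the last bit of the result.
__global__ void axpy(const float *a, const float *x, const float *y,
                     float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = a[i] * x[i] + y[i];
}
```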
I don't know why any of this would lead to variation from one run to the next, however. It's likely that you have a bug in the code.
Upvotes: 1