Reputation: 20895
I am trying to teach myself CUDA. This has not been easy so far, but I don't give up easily either :)
I have created a very simple program. It merely returns a value from the GPU.
import pycuda.driver as cuda
from pycuda.compiler import SourceModule
import pycuda.autoinit
import numpy as np
returnValue = np.zeros(1)
mod = SourceModule("""
__global__ void myVeryFirstKernel(float* returnValue) {
    returnValue[0] = 8.0;
}
""")
func = mod.get_function('myVeryFirstKernel')
func(cuda.InOut(returnValue), block=(1024, 1, 1), grid=(1, 1))
print str(returnValue[0])
However, the value that my program prints is 5.387879938e-315. That sure doesn't look like 8.0. Why is the wrong value getting returned from the GPU?
I have tried altering the block size, which I don't think should change anything (but who knows). I have also checked that the data type I am sending in (float64) matches my kernel.
Upvotes: 1
Views: 108
Reputation: 72349
You have a type conflict: your kernel expects a 32-bit single-precision value (float), but np.zeros defaults to float64, so you are passing a 64-bit double-precision value to it. If you rewrite your code something like this:
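import pycuda.driver as cuda
from pycuda.compiler import SourceModule
import pycuda.autoinit
import numpy as np

# Allocate the host array as float32 so it matches the kernel's float*.
returnValue = np.zeros(1, dtype=np.float32)
mod = SourceModule("""
__global__ void myVeryFirstKernel(float* returnValue) {
    // The f suffix keeps the literal in single precision.
    returnValue[0] = 8.0f;
}
""")
func = mod.get_function('myVeryFirstKernel')
func(cuda.InOut(returnValue), block=(1024, 1, 1), grid=(1, 1))
print(returnValue[0])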
so that everything is explicitly specified in single precision, you might have more luck.
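To see where the odd number in the question comes from, here is a rough sketch that needs no GPU. It assumes a little-endian host (typical for x86) and uses the standard-library struct module to stand in for the device write; the names written, buffer and as_double are only illustrative. The kernel stores 8.0 as a 4-byte float into the start of an 8-byte float64 that was zeroed, and the host then reads all 8 bytes back as one double:

```python
import struct

# Bytes the kernel writes: 8.0 stored as a 32-bit float.
# Assumption: little-endian byte order, hence the "<" prefix.
written = struct.pack("<f", 8.0)

# The host buffer was a single zeroed float64 (8 bytes). The kernel only
# overwrites the first 4 bytes; the remaining 4 stay zero.
buffer = written + b"\x00" * 4

# The host interprets those same 8 bytes as one float64.
(as_double,) = struct.unpack("<d", buffer)
print(as_double)
```

The result is a tiny denormal of roughly 5.3878799e-315, which lines up with the value printed in the question. Once the host array is float32, the 4 bytes written by the kernel are the whole element and it reads back as 8.0.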
Upvotes: 5