Reputation: 427
I have written a method that is called from a .cpp file for the purpose of running cudaMemcpy. The method is below:
void copy_to_device(uint32_t *host, uint32_t *device, int size)
{
    cudaError_t ret;
    ret = cudaMemcpy(device, host, size*sizeof(uint32_t), cudaMemcpyHostToDevice);
    if(ret == cudaErrorInvalidValue)
        printf("1!\n");
    else if(ret == cudaErrorInvalidDevicePointer)
        printf("2!\n");
    else if(ret == cudaErrorInvalidMemcpyDirection)
        printf("3!\n");
}
My .cpp file calls it like this:
uint32_t *input_device;
device_malloc(input_device, INPUT_HEIGHT*INPUT_WIDTH);
uint32_t *oneDinput = TwoDtoOneD(input, INPUT_HEIGHT, INPUT_WIDTH);
copy_to_device(oneDinput, input_device, INPUT_HEIGHT*INPUT_WIDTH);
All that TwoDtoOneD does is take a 2D array, convert it to a 1D array, and return it. Whenever I call copy_to_device, cudaMemcpy returns cudaErrorInvalidValue, which isn't well documented on NVIDIA's website. What is wrong with the parameters I am passing to my function that causes this error? It leads to further problems later, during kernel execution. If you need any more details, please ask.
Here's the method device_malloc:
void device_malloc(uint32_t *buffer, int size)
{
    cudaMalloc((void **) &buffer, size*sizeof(uint32_t));
}
Upvotes: 0
Views: 881
Reputation: 628
The problem is here:
uint32_t *input_device;
device_malloc(input_device, INPUT_HEIGHT*INPUT_WIDTH);
Whatever device_malloc does, it does not modify the value of input_device, because the pointer is passed by value and the function only ever sees a copy of it. That is, unless the first argument is a reference to a pointer, but I am ready to bet it is not.
You need to change the first argument of device_malloc to a pointer to a pointer, and call it like this:
device_malloc(&input_device, INPUT_HEIGHT*INPUT_WIDTH);
Or just have device_malloc return a pointer to the allocated memory.
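To make the difference concrete, here is a minimal sketch of the broken signature next to both fixes. Note the assumptions: fake_malloc is a hypothetical stand-in for cudaMalloc that uses host malloc so the example runs without a GPU, and the three device_malloc_* names are mine, not from the question.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in for cudaMalloc: hands the new address back through
// a pointer-to-pointer and returns 0 on success. Uses host malloc.
static int fake_malloc(void **out, std::size_t bytes)
{
    *out = std::malloc(bytes);
    return *out ? 0 : 1;
}

// Broken, like the question: `buffer` is a local copy of the caller's
// pointer, so the assignment below is invisible to the caller.
void device_malloc_by_value(uint32_t *buffer, int size)
{
    void *p = nullptr;
    fake_malloc(&p, size * sizeof(uint32_t));
    buffer = static_cast<uint32_t *>(p); // only changes the local copy
    std::free(buffer);                   // freed here so the demo does not leak
}

// Fix 1: take the address of the caller's pointer and write through it.
void device_malloc_by_pointer(uint32_t **buffer, int size)
{
    void *p = nullptr;
    fake_malloc(&p, size * sizeof(uint32_t));
    *buffer = static_cast<uint32_t *>(p);
}

// Fix 2: return the allocated pointer instead of using an out-parameter.
uint32_t *device_malloc_returning(int size)
{
    void *p = nullptr;
    fake_malloc(&p, size * sizeof(uint32_t));
    return static_cast<uint32_t *>(p);
}
```

With the real runtime you would swap fake_malloc for cudaMalloc and check its return status. A caller that starts with a null pointer still has a null pointer after device_malloc_by_value, while device_malloc_by_pointer(&ptr, n) and ptr = device_malloc_returning(n) both leave it pointing at the new allocation.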
To answer your question more directly, cudaMemcpy returns an error because its first argument, device, is not a valid device pointer, which the CUDA runtime is able to check. It probably holds a garbage value, since it is never initialized due to the above issue.
As a side note unrelated to the issue, you may want to use the cudaGetErrorString function for a more convenient way to print out the status.
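For example, the copy helper from the question could report any failure by name instead of matching three specific codes. This is only a sketch: returning the status instead of void and taking host as const are my changes, not something the question requires, and it needs the CUDA toolkit to build.

```cpp
#include <cstdint>
#include <cstdio>
#include <cuda_runtime.h>

// Same copy as in the question, but every non-success status is printed
// as text via cudaGetErrorString. Returning the status (an assumption,
// the original returns void) lets the caller stop before launching kernels.
cudaError_t copy_to_device(const uint32_t *host, uint32_t *device, int size)
{
    cudaError_t ret = cudaMemcpy(device, host, size * sizeof(uint32_t),
                                 cudaMemcpyHostToDevice);
    if (ret != cudaSuccess)
        fprintf(stderr, "cudaMemcpy failed: %s\n", cudaGetErrorString(ret));
    return ret;
}
```

The same check applies to the cudaMalloc call inside device_malloc, whose return value the question currently ignores.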
Upvotes: 2