Reputation: 62469
I'm having a very hard time finding the problem in my code. I'm trying to pass a function pointer to the GPU to be executed by the kernel, but for some reason it all fails. Here's my declaration:
typedef void* (*map_func)(void* p);
__device__ void* f(void* param)
{
    // Keep the value pointer-sized so the casts to and from void* are lossless.
    intptr_t value = (intptr_t)param;
    return (void*)(value * value * value);
}
__device__ map_func d_map = f;
Notice the assignment of f. This is to prove that calling the function pointer with a device function works.
In the kernel I'm calling d_map. On the host I'm declaring and copying a function pointer:
void* square(void* param)
{
    // Host function: its address is a host address.
    intptr_t value = (intptr_t)param;
    return (void*)(value * value);
}
...
map_func h_map = square;
cudaMemcpyToSymbol(d_map, &h_map, sizeof(map_func));
However, when actually executing the kernel, I get "unspecified launch failure", which I suspect means a segfault. I've tested the return code of cudaMemcpyToSymbol and it is success.
In summary: calling d_map works if it points to a device function, but fails if I copy a host function pointer into it.
I'm sure it's some stupid mistake that I'm unable to spot. Thanks for your help.
Upvotes: 1
Views: 1016
Reputation: 780
This just occurred to me; it may be worth a try. Declare the functions you want to pass around as device functions. Give each of them a device function pointer, as you've done for f. Call
cudaMemcpyToSymbol(d_map, &<function of choice>, sizeof(d_map), 0, cudaMemcpyDeviceToDevice)
Since you are copying memory allocated on your device, the function pointer size should hopefully match.
This is completely untested. It may kill your card.
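As a hedged illustration of the idea above, here is a minimal sketch of one way to swap device functions at run time. It assumes that taking the address of a __device__ function directly in host code does not give a usable device address, so instead of `&<function of choice>` it reads the address out of a per-function __device__ pointer variable with cudaMemcpyFromSymbol and then writes it into d_map with cudaMemcpyToSymbol. The names cube, d_cube_ptr, d_square_ptr, apply and run_map are hypothetical and not from the question, and error checking is reduced to a single macro.

```cuda
#include <cstdio>
#include <cstdint>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error: %s\n",                       \
                    cudaGetErrorString(err_));                        \
            exit(1);                                                  \
        }                                                             \
    } while (0)

typedef void* (*map_func)(void* p);

// Both candidate functions are device functions (assumption: square
// is moved to the device, unlike in the question).
__device__ void* cube(void* param)
{
    intptr_t value = (intptr_t)param;
    return (void*)(value * value * value);
}

__device__ void* square(void* param)
{
    intptr_t value = (intptr_t)param;
    return (void*)(value * value);
}

// One device-side pointer per function; each holds a device address.
__device__ map_func d_cube_ptr = cube;
__device__ map_func d_square_ptr = square;

// The pointer the kernel actually calls.
__device__ map_func d_map = cube;

__global__ void apply(intptr_t* out, intptr_t in)
{
    *out = (intptr_t)d_map((void*)in);
}

// Hypothetical helper: select a function, run it on x, return the result.
intptr_t run_map(bool use_square, intptr_t x)
{
    map_func h_map = nullptr;

    // Fetch the device address of the chosen function into a host variable.
    if (use_square)
        CHECK(cudaMemcpyFromSymbol(&h_map, d_square_ptr, sizeof(map_func)));
    else
        CHECK(cudaMemcpyFromSymbol(&h_map, d_cube_ptr, sizeof(map_func)));

    // Store that device address in the pointer used by the kernel.
    CHECK(cudaMemcpyToSymbol(d_map, &h_map, sizeof(map_func)));

    intptr_t* d_out = nullptr;
    CHECK(cudaMalloc(&d_out, sizeof(intptr_t)));
    apply<<<1, 1>>>(d_out, x);
    CHECK(cudaGetLastError());

    intptr_t h_out = 0;
    // cudaMemcpy waits for the kernel to finish before copying.
    CHECK(cudaMemcpy(&h_out, d_out, sizeof(intptr_t), cudaMemcpyDeviceToHost));
    CHECK(cudaFree(d_out));
    return h_out;
}

int main()
{
    printf("square(3) = %ld\n", (long)run_map(true, 3));
    printf("cube(3)   = %ld\n", (long)run_map(false, 3));
    return 0;
}
```

The host never calls or dereferences h_map; it only carries the device address from one symbol to another, which is why the pointer sizes match.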
Upvotes: 1