Trouble with function pointers in CUDA

Question

I'm having a very hard time finding the problem in my code. I'm trying to pass a function pointer to the GPU to be executed by the kernel, but for some reason it all fails. Here's my declaration:

typedef void* (*map_func)(void* p);

__device__ void* f(void* param)
{
    int value = (intptr_t)param;
    return (void*)(value * value * value);
}

__device__ map_func d_map = f;

Notice the assignment of f. This is to prove that calling the function pointer with a device function works.

In the kernel I'm calling d_map. On the host I'm declaring and copying a function pointer:

void* square(void* param)
{
    int value = (intptr_t)param;
    return (void*)(value * value);
}

...

map_func h_map = square;    
cudaMemcpyToSymbol(d_map, &h_map, sizeof(map_func));

However, when I actually execute the kernel, I get an "unspecified launch failure", which I suspect means a segfault. I've checked the return code of cudaMemcpyToSymbol and it reports success.

In summary: calling d_map works when it points to a device function, but fails when I copy a host function pointer into it.

I'm sure it's some stupid mistake that I'm unable to spot. Thanks for your help.

Camford · Accepted Answer

This just occurred to me, and it may be worth a try. Declare the functions you want to pass around as device functions. Give each of them a device function pointer, as you've done for f. Then call

cudaMemcpyToSymbol(d_map, &, sizeof(d_map), 0, cudaMemcpyDeviceToDevice)

Since you are copying memory allocated on your device, the function pointer size should hopefully match.

This is completely untested. It may kill your card.
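
For illustration, here is a minimal sketch of the idea above, with one change that should be stated plainly: instead of the device-to-device cudaMemcpyToSymbol call, it stages the pointer through the host with cudaMemcpyFromSymbol and then writes it into d_map with cudaMemcpyToSymbol. The names d_cube_ptr, d_square_ptr, apply, check and run_with_square are hypothetical and not from the question. It assumes a GPU and toolkit that support device function pointers (compute capability 2.0 or later), and it declares square as a __device__ function, since a host function's address is a CPU address that the kernel cannot call.

```cuda
#include <cstdio>
#include <cstdint>
#include <cstdlib>
#include <cuda_runtime.h>

typedef void* (*map_func)(void* p);

// Both candidates are __device__ functions, so their addresses are valid on the GPU.
__device__ void* cube(void* param)
{
    intptr_t value = (intptr_t)param;
    return (void*)(value * value * value);
}

__device__ void* square(void* param)
{
    intptr_t value = (intptr_t)param;
    return (void*)(value * value);
}

// One device-side pointer per function (hypothetical names). They are
// initialized in device code, so they hold real device addresses.
__device__ map_func d_cube_ptr = cube;
__device__ map_func d_square_ptr = square;

// The pointer the kernel calls; it starts out pointing at cube.
__device__ map_func d_map = cube;

__global__ void apply(intptr_t* out, intptr_t in)
{
    *out = (intptr_t)d_map((void*)in);
}

// Hypothetical helper: abort with a message on any CUDA error.
static void check(cudaError_t err, const char* what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

// Hypothetical helper: point d_map at square, then run the kernel once.
intptr_t run_with_square(intptr_t input)
{
    // Read the device address of square into a host variable. The host
    // never calls through h_map; it only carries the value across.
    map_func h_map = nullptr;
    check(cudaMemcpyFromSymbol(&h_map, d_square_ptr, sizeof(map_func)),
          "cudaMemcpyFromSymbol");
    check(cudaMemcpyToSymbol(d_map, &h_map, sizeof(map_func)),
          "cudaMemcpyToSymbol");

    intptr_t* d_out = nullptr;
    check(cudaMalloc(&d_out, sizeof(intptr_t)), "cudaMalloc");
    apply<<<1, 1>>>(d_out, input);
    check(cudaGetLastError(), "kernel launch");
    check(cudaDeviceSynchronize(), "kernel execution");

    intptr_t h_out = 0;
    check(cudaMemcpy(&h_out, d_out, sizeof(intptr_t), cudaMemcpyDeviceToHost),
          "cudaMemcpy");
    check(cudaFree(d_out), "cudaFree");
    return h_out;
}

int main()
{
    printf("%lld\n", (long long)run_with_square(3));
    return 0;
}
```

The cudaDeviceSynchronize check is there because a bad function pointer typically shows up as an error only once the kernel has run, not at the cudaMemcpyToSymbol call, which matches the symptom described in the question.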
