Reputation: 1110
I have a reusable function in some CUDA code that needs to be called from both the device and the host. Is there an appropriate qualifier for this?
For example, what is the correct definition of func1 in this case?
int func1(int a, int b) {
    return a + b;
}
__global__ void devicecode(float *A) {
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    A[i] = func1(i, i);
}
int main() {
    // Normal CUDA memory set-up
    // Call func1 from inside main:
    int j = func1(2, 4);
    // Normal CUDA memory copy / program run / retrieve data
}
So far I can only get this to work by defining the function twice: once explicitly for the device and once for the host. Is there a better way?
Upvotes: 9
Views: 4285
Reputation: 9769
From the CUDA Programming Guide:
The __device__ and __host__ qualifiers can be used together however, in which case the function is compiled for both the host and the device.
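Applied to the code in the question, that might look roughly like the sketch below. It is only an illustration, not a drop-in fix: the extra kernel parameter n, the array size of 256, and the block size of 128 are assumptions added here so the example is complete, and error handling is kept minimal.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Both qualifiers together: nvcc compiles this once for the host
// and once for the device, so a single definition serves both.
__host__ __device__ int func1(int a, int b) {
    return a + b;
}

// Kernel from the question; the n parameter is an assumption added
// here so threads past the end of the array do nothing.
__global__ void devicecode(float *A, int n) {
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < n) {
        A[i] = func1(i, i);  // device-side call
    }
}

int main() {
    const int n = 256;  // hypothetical array size
    float *dA = nullptr;
    if (cudaMalloc(&dA, n * sizeof(float)) != cudaSuccess) {
        std::fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    int j = func1(2, 4);  // host-side call to the same function

    // Launch enough 128-thread blocks to cover n elements.
    devicecode<<<(n + 127) / 128, 128>>>(dA, n);

    float hA[n];
    cudaError_t err =
        cudaMemcpy(hA, dA, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dA);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    std::printf("host result: %d, device A[3]: %f\n", j, hA[3]);
    return 0;
}
```

If the host and device versions ever need to differ, the __CUDA_ARCH__ macro is defined only while the device version is being compiled, so it can be used inside the function body to select between the two code paths.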
Upvotes: 18