Reputation: 1433
Objective is to call a device function available in another file, when i compile the global kernel it shows the following error *External calls are not supported (found non-inlined call to _Z6GoldenSectionCUDA)*.
Problematic Code (not the full code but where the problem arises), cat norm.h
# ifndef NORM_H_
# define NORM_H_
# include<stdio.h>
__device__ double invcdf(double prob, double mean, double stddev);
#endif
cat norm.cu
# include <norm.h>
__device__ double invcdf(double prob, double mean, double stddev) {
return (mean + stddev*normcdfinv(prob));
}
cat test.cu
# include <norm.h>
# include <curand.h>
# include <curand_kernel.h>
__global__ void phase2Kernel(double* out_profit, struct strategyHolder* strategy) {
curandState seedValue;
curand_init(threadIdx.x, 0, 0, &seedValue);
double randomD = invcdf(curand_uniform_double( &seedValue ), 300, 80);
}
nvcc -c norm.cu -o norm.o -I"."
nvcc -c test.cu -o test.o -I"."
Upvotes: 5
Views: 2246
Reputation: 21108
You're trying to do separate compilation, which needs some special command line options. See the NVCC manual for details, but here's how to get your example to compile. I've targeted sm_20, but you can target sm_20 or later depending on what GPU you have. Separate compilation is not possible on older devices (sm_1x).
__device__
function as extern
in your header file, but if you have any static device variables they will need to be declared as extern
Generate relocatable code for the device by compiling as shown below (-dc
is the device equivalent of -c
, see the manual for more information)
nvcc -arch=sm_20 -dc norm.cu -o norm.o -I.
nvcc -arch=sm_20 -dc test.cu -o test.o -I.
Link the device parts of the code by calling nvlink before the final host link
nvlink -arch=sm_20 norm.o test.o -o final.o
Upvotes: 7