Reputation: 547
When compiling a static library with multiple .h and .cu files, I get an unresolved extern function error. Here is a short example that replicates the error.
It appears that I can't get Nsight Eclipse Edition to compile extrafunctions.cu first. In my full project the file with the extra functions is compiled first, but the build still fails with the same unresolved extern function error.
Here's the output for this sample:
**** Build of configuration Debug for project linkerror ****
make all
Building file: ../cudatest.cu
Invoking: NVCC Compiler
nvcc -I/usr/local/cuda/include -G -g -O0 -gencode arch=compute_30,code=sm_30 -odir "" -M -o "cudatest.d" "../cudatest.cu"
nvcc --compile -G -I/usr/local/cuda/include -O0 -g -gencode arch=compute_30,code=compute_30 -gencode arch=compute_30,code=sm_30 -x cu -o "cudatest.o" "../cudatest.cu"
../cudatest.cu(19): warning: variable "devInts" is used before its value is set
../cudatest.cu(19): warning: variable "devInts" is used before its value is set
ptxas fatal : Unresolved extern function '_Z9incrementi'
make: *** [cudatest.o] Error 255
**** Build Finished ****
cudatest.h:
#ifndef CUDAPATH_H_
#define CUDAPATH_H_
#include <cuda.h>
#include <cuda_runtime.h>
#include "extrafunctions.h"
void test();
#endif /* CUDAPATH_H_ */
cudatest.cu:
#include <cuda.h>
#include <cuda_runtime.h>
#include "extrafunctions.h"
__global__ void kernel(int* devInts){
    int tid = threadIdx.x + (blockDim.x*blockIdx.x);
    if (tid == 0){
        for(int i = 0; i < NUMINTS; i++){
            devInts[i] = increment(devInts[i]);
        }
    }
}
void test(){
    int* myInts = (int*)malloc(NUMINTS * sizeof(int));
    int* devInts;
    cudaMemcpy((void**)devInts, myInts, NUMINTS*sizeof(int), cudaMemcpyHostToDevice);
    kernel<<<1,1>>>(devInts);
    int* outInts = (int*)malloc(NUMINTS * sizeof(int));
    cudaFree(devInts);
    free(myInts);
    free(outInts);
}
extrafunctions.h:
#ifndef EXTRAFUNCTIONS_H_
#define EXTRAFUNCTIONS_H_
#include <cuda.h>
#include <cuda_runtime.h>
#define NUMINTS 4
int __device__ increment(int i);
#endif /* EXTRAFUNCTIONS_H_ */
extrafunctions.cu:
#include <cuda.h>
#include <cuda_runtime.h>
#include "extrafunctions.h"
int __device__ increment(int i){
    return i+1;
}
Upvotes: 1
Views: 2706
Reputation: 9474
You need to explicitly enable separate compilation for this to work. Right-click your project, "Properties", Build->CUDA and select "Separate compilation" linker mode.
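Outside the IDE, that setting corresponds roughly to compiling with relocatable device code. The commands below are a minimal sketch, not the exact lines Nsight generates: they assume the file names from the question sit in the current directory, reuse the sm_30 target from the build log, and use a hypothetical main.cpp that calls test().

```shell
# Compile each .cu file to an object with relocatable device code.
# -dc is shorthand for --relocatable-device-code=true --compile.
nvcc -arch=sm_30 -dc -o extrafunctions.o extrafunctions.cu
nvcc -arch=sm_30 -dc -o cudatest.o cudatest.cu

# Link with nvcc, which runs the device-link step implicitly and
# resolves increment() across the two objects.
# main.cpp is a hypothetical host file that calls test().
nvcc -arch=sm_30 -o app main.cpp cudatest.o extrafunctions.o
```

Without -dc, each .cu file has to contain the definitions of every __device__ function it calls, which is why the call to increment() in cudatest.cu fails at the ptxas stage.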
Please note that separate compilation only works on SM 2.0+ GPUs and can only emit SASS (i.e. it is not possible to emit PTX that will be compatible with future CUDA devices). For more information, please read "Using Separate Compilation in CUDA" in the NVCC manual.
Update: You need to use the NVCC linker to link device code, which is why the GCC linker fails. In Nsight you can either link the whole application using NVCC, or set up two projects: a static library project that contains all the CUDA code and is built with the NVCC toolchain, and a regular C/C++ project that uses GCC and links against the static library produced by the first project.
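On the command line, the static library route could look roughly like this. It is a sketch under stated assumptions rather than what Nsight emits: cudatest.o and extrafunctions.o are assumed to have been compiled with `nvcc -dc`, and libcudatest.a, device_link.o, main.cpp and the library path /usr/local/cuda/lib64 are hypothetical names chosen for illustration.

```shell
# 1. Device-link with nvcc so the GCC linker never has to resolve
#    device symbols such as increment().
nvcc -arch=sm_30 -dlink -o device_link.o cudatest.o extrafunctions.o

# 2. Archive the host objects together with the device-link object.
nvcc -lib -o libcudatest.a cudatest.o extrafunctions.o device_link.o

# 3. Build the host application with g++ and link against the archive
#    plus the CUDA device runtime and runtime libraries.
#    main.cpp is a hypothetical file that includes cudatest.h and calls test().
g++ -I/usr/local/cuda/include -o app main.cpp \
    -L. -lcudatest -L/usr/local/cuda/lib64 -lcudadevrt -lcudart
```

The device-link object is what lets the final link be done by GCC: all cross-file device references are already resolved inside the archive.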
Upvotes: 5