Reputation: 41
I need to create a shared library for CUDA. Compiling the library works fine, but when I try to use it in my program, nvcc reports either an nvlink or a ptxas error.
I reduced the problem to the following code. The library must replace several C functions (here: memset). The library consists of three C++ files, shown in this order: FileA.h, FileA.cpp, and TempClass.h:
#ifndef FILEA_H_
#define FILEA_H_
namespace A {
__device__
void* memset(void* _in, int _val, int _size);
};
#endif
#include "FileA.h"
__device__
void* A::memset(void* _in, int _val, int _size) {
char* tmp = (char*)_in;
for(int i = 0; i < _size; i++) tmp[i] = _val;
return _in;
}
#ifndef TEMPCLASS_H_
#define TEMPCLASS_H_
#include "FileA.h"
namespace A {
template <typename T>
class TC {
public:
__device__
TC() {
data = new T[10];
}
__device__
~TC(){
delete [] data;
}
__device__
void clear(){
A::memset(data, 0, 10*sizeof(T));
}
T* data;
};
};
#endif
Using the following commands I create a shared library:
nvcc -Xcompiler -fPIC -x cu -rdc=true -c FileA.cpp -o FileA.o
nvcc -Xcompiler -fPIC --shared -o libTestA.so FileA.o -lcudart
This library should be used in a main program:
#include <cuda.h>
#include <TempClass.h>
#include <iostream>
__device__
int doSomthing() {
A::TC<int>* tc = new A::TC<int>();
tc->clear();
for (int i = 0; i < 5; i++) tc->data[i] = i;
int sum = 0;
for (int i = 0; i < 5; i++) sum += tc->data[i];
delete tc;
return sum;
}
__global__
void kernel(int* _res) {
_res[0] = doSomthing();
}
int main(int argc, char** argv) {
int* devVar;
int* hostVar;
hostVar = new int[1];
hostVar[0] = -1;
cudaMalloc(&devVar, sizeof(int));
cudaMemcpy(devVar, hostVar, sizeof(int), cudaMemcpyHostToDevice);
kernel<<< 1, 1>>> (devVar);
cudaMemcpy(hostVar, devVar, sizeof(int), cudaMemcpyDeviceToHost);
std::cout << "kernel done. sum " << *hostVar << std::endl;
return 0;
}
If I try to compile the program with the commands:
nvcc -Xcompiler -fPIC -I. -L. -rdc=true -x cu -c main.cpp -o main.o
nvcc -Xcompiler -fPIC -I. -L. main.o -o main -lTestA
I receive the error message:
nvlink error : Undefined reference to '_ZN1A6memsetEPvii' in 'main.o'
I receive the same error if I try to compile the file directly:
nvcc -Xcompiler -fPIC -I. -L. -rdc=true -x cu main.cpp -o main -lTestA
The command nm libTestA.so shows that the library contains the function symbol _ZN1A6memsetEPvii.
When I remove the -rdc=true option while linking, I receive a ptxas error:
ptxas fatal : Unresolved extern function '_ZN1A6memsetEPvii'
In my case static linking is not an option; I need a shared library. I have also tried making memset an extern "C" function, but that collides with the original C function. The code compiles correctly with g++. Do you have any suggestions for how to solve this problem?
Upvotes: 4
Views: 4943
Reputation: 151799
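It appears that you are attempting to do device-code linking across a library boundary: the __device__ function A::memset is defined inside libTestA.so but called from device code in main.cpp. Currently, that is only possible with a static library.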
The options that I am aware of would be to switch to a static library/link arrangement, or else refactor your code so that you do not need to link device code across a dynamic library boundary.
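One way to do the refactoring is sketched below, assuming it is acceptable to ship the device function as source in a header instead of as compiled code in the .so. The definition of A::memset moves from FileA.cpp into FileA.h as an inline function, so every translation unit that includes the header compiles its own copy and nothing has to be resolved by nvlink. The DEVICE_FN macro is a name made up for this sketch; it is not part of CUDA or of the question's code.

```cpp
// FileA.h: hypothetical header-only variant of the question's FileA.h + FileA.cpp
#ifndef FILEA_H_
#define FILEA_H_

// Assumption: DEVICE_FN is a helper macro introduced for this sketch.
// Under nvcc the function is compiled for both host and device; a plain
// C++ compiler such as g++ sees an ordinary inline function.
#ifdef __CUDACC__
#define DEVICE_FN __host__ __device__
#else
#define DEVICE_FN
#endif

namespace A {

// Defined inline in the header, so the caller's translation unit contains
// the device code itself and no device symbol is looked up in libTestA.so.
DEVICE_FN inline void* memset(void* _in, int _val, int _size) {
    char* tmp = static_cast<char*>(_in);
    for (int i = 0; i < _size; i++) tmp[i] = static_cast<char>(_val);
    return _in;
}

}  // namespace A

#endif
```

With this change TempClass.h can stay as it is, and main.cpp should build without -rdc=true, because the kernel no longer references an external device function. The shared library can still carry host-side code; only the device functions have to be visible as source to the code that calls them.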
Upvotes: 5