Reputation: 1
I have an existing C++ program that I want to port to the GPU. The kernel needs to access class objects created in host code; for example, a stringstream object would be used by the threads. However, the code fails to compile with CUDA. How can a kernel access class objects like this that are defined in host functions?
Here is an example.
#include <cstdio>
#include <sstream>
using namespace std;
__global__ void kernel(stringstream * sstr)
{
    printf("%s\n", sstr->str());
}
int main(int argc, char ** argv)
{
    stringstream * sstr;
    cudaMallocManaged(&sstr, sizeof(stringstream));
    *sstr << "Hello world\n";
    kernel<<<32, 32>>>(sstr);
    cudaDeviceSynchronize();
    cudaFree(sstr);
    return 0;
}
I got the following compile errors.
$ nvcc -o bin src.cu
src.cu(8): warning: non-POD class type passed through ellipsis
src.cu(8): error: calling a __host__ function("std::__cxx11::basic_stringstream<char, ::std::char_traits<char> , std::allocator<char> > ::str const") from a __global__ function("kernel") is not allowed
src.cu(8): error: identifier "std::__cxx11::basic_stringstream<char, ::std::char_traits<char> , std::allocator<char> > ::str const" is undefined in device code
src.cu(8): error: calling a __host__ function("std::__cxx11::basic_string<char, ::std::char_traits<char> , std::allocator<char> > ::~basic_string") from a __global__ function("kernel") is not allowed
src.cu(8): error: identifier "std::__cxx11::basic_string<char, ::std::char_traits<char> , std::allocator<char> > ::~basic_string" is undefined in device code
4 errors detected in the compilation of "/tmp/tmpxft_00003bd0_00000000-8_src.cpp1.ii".
Upvotes: 0
Views: 625
Reputation: 168
You should not use C++ std classes inside your kernels. The std::stringstream member functions are host-only: they are pre-compiled and linked from your system's standard library, and nvcc does not generate the corresponding __device__ functions.
see this topic
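One possible workaround, sketched here under the assumption that the kernel only needs to read the finished text: build the string on the host with std::stringstream, then copy its characters into a plain managed buffer and pass that char pointer to the kernel. The variable names and the small launch configuration are illustrative choices, not part of the original code.

```cuda
#include <cstdio>
#include <cstring>
#include <sstream>
#include <string>

// The kernel only sees a plain, null-terminated character buffer.
__global__ void kernel(const char *text)
{
    printf("%s", text);
}

int main()
{
    // Build the text on the host, where std::stringstream is available.
    std::stringstream sstr;
    sstr << "Hello world\n";
    const std::string host_text = sstr.str();

    // Copy the characters (plus the terminating '\0') into managed memory.
    char *text = nullptr;
    const size_t bytes = host_text.size() + 1;
    if (cudaMallocManaged(&text, bytes) != cudaSuccess) {
        fprintf(stderr, "cudaMallocManaged failed\n");
        return 1;
    }
    std::memcpy(text, host_text.c_str(), bytes);

    kernel<<<1, 4>>>(text);   // small launch (assumption) to keep output short
    cudaDeviceSynchronize();
    cudaFree(text);
    return 0;
}
```

Passing a raw character pointer also avoids the "non-POD class type passed through ellipsis" warning, because printf receives a const char * instead of a std::string.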
Upvotes: 3
Reputation: 312
std::stringstream probably has a dynamically allocated array inside, which you won't be able to access in your device code; that alone makes it a bad idea to pass such a class to a GPU.
Your compilation fails here because you also attempt to call a __host__ function from device code, which is not possible.
You probably need a custom stringstream adapted to CUDA if you want that to work.
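As a rough idea of what such a custom class could look like, here is a minimal sketch. FixedText is a hypothetical type invented for illustration: it stores its characters inline in a fixed-size array (so there is no hidden heap pointer), marks every member __host__ __device__, and silently truncates on overflow. It only handles C strings, not the full stream formatting that std::stringstream offers.

```cuda
#include <cstdio>
#include <new>

// Hypothetical fixed-capacity text buffer: no heap allocation and no
// host-only member functions, so it is usable from host and device code.
struct FixedText {
    static const int kCapacity = 128;   // assumed capacity, including '\0'
    char data[kCapacity];
    int length;

    __host__ __device__ FixedText() : length(0) { data[0] = '\0'; }

    // Append a C string, truncating if the buffer is full.
    __host__ __device__ FixedText &operator<<(const char *s)
    {
        while (*s != '\0' && length < kCapacity - 1) {
            data[length++] = *s++;
        }
        data[length] = '\0';
        return *this;
    }

    __host__ __device__ const char *c_str() const { return data; }
};

__global__ void kernel(const FixedText *text)
{
    printf("%s", text->c_str());
}

int main()
{
    FixedText *text = nullptr;
    if (cudaMallocManaged(&text, sizeof(FixedText)) != cudaSuccess) {
        return 1;
    }
    new (text) FixedText();            // construct the object in managed memory
    *text << "Hello " << "world\n";    // filled on the host

    kernel<<<1, 4>>>(text);            // small launch (assumption)
    cudaDeviceSynchronize();

    text->~FixedText();
    cudaFree(text);
    return 0;
}
```

Note the placement new: cudaMallocManaged only returns raw memory, so an object placed there has to be constructed explicitly before it is used. The original example skips that step for the stringstream, which is a problem on the host side even before the kernel is involved.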
Upvotes: 0