lijin01ictaccn

Reputation: 1

How can the GPU device function access class objects defined in host functions?

I have an existing C++ program that I want to port to the GPU. The kernel function needs to access class objects defined in a host function. For example, a stringstream object will be used by the threads. However, the code fails to compile with CUDA. How can a kernel function access this kind of class object defined in a host function?

Here is an example.

#include <cstdio>
#include <sstream>

using namespace std;

__global__ void kernel(stringstream * sstr)
{
    printf("%s\n", sstr->str());
}

int main(int argc, char ** argv)
{
    stringstream * sstr;
    cudaMallocManaged(&sstr, sizeof(stringstream));
    *sstr  << "Hello world\n";
    kernel<<<32, 32>>>(sstr);
    cudaDeviceSynchronize();
    cudaFree(sstr);
    return 0;
}

I got the following compile error.

$ nvcc -o bin src.cu
src.cu(8): warning: non-POD class type passed through ellipsis

src.cu(8): error: calling a __host__ function("std::__cxx11::basic_stringstream<char,  ::std::char_traits<char> , std::allocator<char> > ::str const") from a __global__ function("kernel") is not allowed

src.cu(8): error: identifier "std::__cxx11::basic_stringstream<char,  ::std::char_traits<char> , std::allocator<char> > ::str const" is undefined in device code

src.cu(8): error: calling a __host__ function("std::__cxx11::basic_string<char,  ::std::char_traits<char> , std::allocator<char> > ::~basic_string") from a __global__ function("kernel") is not allowed

src.cu(8): error: identifier "std::__cxx11::basic_string<char,  ::std::char_traits<char> , std::allocator<char> > ::~basic_string" is undefined in device code

4 errors detected in the compilation of "/tmp/tmpxft_00003bd0_00000000-8_src.cpp1.ii".

Upvotes: 0

Views: 625

Answers (2)

ppolet

Reputation: 168

You should not use C++ standard library classes inside your kernels. The std::stringstream member functions are precompiled host code linked in from your system's standard library, so nvcc does not generate the corresponding __device__ functions.

see this topic
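
One possible workaround, sketched here under assumptions rather than taken from the question, is to keep the stringstream on the host and hand the kernel only its characters. The helper name `to_managed` is hypothetical. Because the stringstream is an ordinary local object, it is also properly constructed, which raw memory returned by cudaMallocManaged is not.

```cuda
#include <cstdio>
#include <cstring>
#include <sstream>
#include <string>

// The kernel receives plain characters, not a std::stringstream.
__global__ void kernel(const char *text)
{
    // Only one thread prints; otherwise every launched thread would print a copy.
    if (blockIdx.x == 0 && threadIdx.x == 0)
        printf("%s", text);
}

// Hypothetical helper: copy a host string (including its terminating '\0')
// into managed memory. Returns nullptr if the allocation fails.
char *to_managed(const std::string &s)
{
    char *buf = nullptr;
    if (cudaMallocManaged(&buf, s.size() + 1) != cudaSuccess)
        return nullptr;
    std::memcpy(buf, s.c_str(), s.size() + 1);
    return buf;
}

int main()
{
    std::stringstream sstr;          // ordinary host object, constructed normally
    sstr << "Hello world\n";

    char *text = to_managed(sstr.str());
    if (text == nullptr)
        return 1;

    kernel<<<32, 32>>>(text);
    cudaDeviceSynchronize();         // wait for the kernel and flush device printf
    cudaFree(text);
    return 0;
}
```

All stream formatting happens on the host; the device only reads a null-terminated character array, so no __host__ function is called from the kernel.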

Upvotes: 3

amlucas

Reputation: 312

std::stringstream probably has a dynamically allocated array inside, which you won't be able to access in your device code; that alone makes it a bad idea to pass such a class to the GPU.

Your compilation fails here because you also attempt to call __host__ functions (stringstream::str() and the std::string destructor) from device code, which is not possible. You probably need a custom stringstream adapted to CUDA if you want that to work.
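
As a rough illustration of what such a CUDA-adapted class could look like, here is a minimal sketch. The type `DeviceText`, its capacity and its member names are all made up for this example. It stores its characters inline, so the whole object fits in one managed allocation, and its members are marked __host__ __device__ so nvcc compiles them for both sides.

```cuda
#include <cstdio>
#include <new>

// Hypothetical fixed-capacity text buffer. There are no heap pointers inside,
// so the device can read everything it needs from the object itself.
struct DeviceText {
    enum { kCapacity = 64 };
    char data[kCapacity];
    int length;

    __host__ __device__ DeviceText() : length(0) { data[0] = '\0'; }

    // Append a C string, silently truncating once the buffer is full.
    __host__ __device__ void append(const char *s)
    {
        while (*s != '\0' && length < kCapacity - 1)
            data[length++] = *s++;
        data[length] = '\0';
    }

    __host__ __device__ const char *c_str() const { return data; }
};

__global__ void kernel(const DeviceText *text)
{
    // Only one thread prints, to keep the output to a single line.
    if (blockIdx.x == 0 && threadIdx.x == 0)
        printf("%s", text->c_str());
}

int main()
{
    DeviceText *text = nullptr;
    if (cudaMallocManaged(&text, sizeof(DeviceText)) != cudaSuccess)
        return 1;

    new (text) DeviceText();         // placement new: run the constructor in managed memory
    text->append("Hello world\n");

    kernel<<<1, 32>>>(text);
    cudaDeviceSynchronize();         // wait for the kernel and flush device printf
    cudaFree(text);                  // DeviceText owns no resources, so no destructor call is needed
    return 0;
}
```

The fixed capacity is the price of avoiding dynamic allocation; a real replacement for stringstream would need a policy for overflow and for formatting numbers.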

Upvotes: 0
