"exited with code 255" when trying to call __device__ function within __global__ function

Question

I have the following test.hpp which declares test():

#pragma once
#include "cuda_runtime.h"
#include "device_launch_parameters.h"

__host__ __device__ void test();

and test.cpp which defines test():

#include "test.hpp"

__host__ __device__ void test() { }

The following kernel.cu fails to compile (with exit code 255, and no other info):

#include "test.hpp"

__global__ 
void gpu(int x)
{
    test(); // compiles just fine if I comment out this line
}

int main()
{
    // can be called multiple times from host with no problems
    test();
    test();
    test();

    return 0;
}

As the comment says, if I remove the test() call from the gpu kernel, the code compiles and runs without error.

Why is this? How can I fix it?

Edit: I should mention that my environment and compilation commands are correct; I have compiled many of the sample projects without issues.

Gaberocksall · Accepted Answer

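A comment by @Robert Crovella set me on the right track to solving this issue.

The problem is that test() is defined in a .cpp file, which nvcc hands to the host compiler by default, so no device code is generated for it. The kernel in kernel.cu then calls a device function defined in a different translation unit, and that only works with relocatable device code and a device link step.

I renamed test.cpp to test.cu and test.hpp to test.cuh.

Then I was able to enable separable compilation and device code linking by following these answers: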

https://stackoverflow.com/a/31006889/9816919

https://stackoverflow.com/a/63431536/9816919
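
For reference, the resulting layout can be sketched roughly as follows. This is a minimal sketch under assumptions, not the exact project from the question: the kernel launch in main(), the output name app, and the command-line build are added for illustration, and the three files are shown in one listing separated by comments.

```cuda
// ---- test.cuh ----
#pragma once
#include "cuda_runtime.h"

__host__ __device__ void test();

// ---- test.cu ----
// Compiled by nvcc as CUDA source, so both host and device code
// are generated for test().
#include "test.cuh"

__host__ __device__ void test() { }

// ---- kernel.cu ----
#include "test.cuh"

__global__ void gpu(int x)
{
    test(); // defined in test.cu; resolved during device linking
}

int main()
{
    test();                  // host-side call, as in the question

    gpu<<<1, 1>>>(0);        // launch added for illustration (assumption)
    cudaDeviceSynchronize(); // wait for the kernel to finish

    return 0;
}

// Build from the command line (assumes nvcc is on PATH; "app" is arbitrary):
//   nvcc -rdc=true test.cu kernel.cu -o app
```

The -rdc=true flag (long form --relocatable-device-code=true) asks nvcc to generate relocatable device code, and nvcc then performs the device link when it builds the final executable. Visual Studio's CUDA integration exposes the same switch as a project property, which is presumably what the linked answers walk through.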
