wotopul
wotopul

Reputation: 125

CUDA __host__ __device__ variables

In CUDA function type qualifiers __device__ and __host__ can be used together in which case the function is compiled for both the host and the device. This allows to eliminate copy-paste. However, there is no such thing as __host__ __device__ variable. I'm looking for an elegant way to do something like this:

__host__ __device__ const double common = 1.0;

__host__ __device__ void foo() {
    ... access common
}

__host__ __device__ void bar() {
    ... access common
}

I found out that the following code complies and runs without errors. (all results were obtained on Ubuntu 14.04 with CUDA 7.5 and gcc 4.8.4 as a host compiler)

#include <iostream>

__device__ const double off = 1.0;

__host__ __device__ double sum(int a, int b) {
    return a + b + off;
}

int main() {
    double res = sum(1, 2);
    std::cout << res << std::endl;
    cudaDeviceReset();
    return 0;
}

$ nvcc main.cu -o main && ./main
4

In fact, nvcc --cuda main.cu translates cu-file into this:

...
static const double off = (1.0);
# 5 "main.cu"
double sum(int a, int b) {
# 6 "main.cu"
return (a + b) + off;
# 7 "main.cu"
}
# 9 "main.cu"
int main() {
# 10 "main.cu"
double res = sum(1, 2);
# 11 "main.cu"
(((std::cout << res)) << (std::endl));
# 12 "main.cu"
cudaDeviceReset();
# 13 "main.cu"
return 0;
# 14 "main.cu"
}
...

But, no surprise, if the variable off is declared without const qualifier (__device__ double off = 1.0) I get the following output:

$ nvcc main.cu -o main && ./main
main.cu(7): warning: a __device__ variable "off" cannot be directly read in a host function

3

So, returning back to the original question, can I rely on this behavior with global __device__ const variable? If not, what are the other options?

UPD By the way, the above behavior doesn't reproduce on Windows.

Upvotes: 4

Views: 12135

Answers (1)

Robert Crovella
Robert Crovella

Reputation: 151879

For ordinary floating point or integral types it should be sufficient simply to mark the variable as const at global scope:

const double common = 1.0;

It should then be usable in any subsequent function, whether host, __host__, __device__, or __global__.

This is supported in the documentation here, subject to various restrictions:

Let 'V' denote a namespace scope variable or a class static member variable that has const qualified type and does not have execution space annotations (e.g., __device__, __constant__, __shared__). V is considered to be a host code variable.

The value of V may be directly used in device code, if V has been initialized with a constant expression before the point of use, and it has one of the following types:

  • builtin floating point type except when the Microsoft compiler is used as the host compiler,
  • builtin integral type.

Device source code cannot contain a reference to V or take the address of V.

In other cases, some possible options are:

  1. Use a compiler macro defined constant:

    #define COMMON 1.0
    
  2. Use templating, if the range of choices on the variable is discrete and limited.

  3. For other options/cases, it may be necessary to manage explicit host and device copies of the variable, e.g. using __constant__ memory on the device, and a corresponding copy on the host. Host and device paths within the __host__ __device__ function that accesses the variable could then differentiate behavior based on a nvcc compiler macro (e.g. #ifdef __CUDA_ARCH__ ...

Upvotes: 9

Related Questions