Alex

Reputation: 1031

Initializing a CUDA global variable

   __constant__ const unsigned int *ff = (const unsigned int[]){90, 50, 100};


int main()
{
}

Compiling:

nvcc ./test.cu
./test.cu(1): error: identifier "__T20" is undefined in device code

1 error detected in the compilation of "/tmp/tmpxft_0000785f_00000000-10_test.cpp2.i".

Verbose compiling:

 nvcc --verbose ./test.cu
    #$ _SPACE_= 
    #$ _CUDART_=cudart
    #$ _HERE_=/usr/lib/nvidia-cuda-toolkit/bin
    #$ _THERE_=/usr/lib/nvidia-cuda-toolkit/bin
    #$ _TARGET_SIZE_=
    #$ _TARGET_DIR_=
    #$ _TARGET_SIZE_=64
    #$ NVVMIR_LIBRARY_DIR=/usr/lib/nvidia-cuda-toolkit/libdevice
    #$ PATH=/usr/lib/nvidia-cuda-toolkit/bin:/home/kasha/bin:/home/kasha/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
    #$ LIBRARIES=  -L/usr/lib/x86_64-linux-gnu/stubs
    #$ gcc -D__CUDA_ARCH__=200 -E -x c++        -DCUDA_DOUBLE_MATH_FUNCTIONS  -D__CUDACC__ -D__NVCC__  -D"__CUDACC_VER__=70517" -D"__CUDACC_VER_BUILD__=17" -D"__CUDACC_VER_MINOR__=5" -D"__CUDACC_VER_MAJOR__=7" -include "cuda_runtime.h" -m64 "./test.cu" > "/tmp/tmpxft_0000799b_00000000-9_test.cpp1.ii" 
    #$ cudafe --allow_managed --m64 --gnu_version=50400 -tused --no_remove_unneeded_entities --gen_c_file_name "/tmp/tmpxft_0000799b_00000000-4_test.cudafe1.c" --stub_file_name "/tmp/tmpxft_0000799b_00000000-4_test.cudafe1.stub.c" --gen_device_file_name "/tmp/tmpxft_0000799b_00000000-4_test.cudafe1.gpu" --nv_arch "compute_20" --gen_module_id_file --module_id_file_name "/tmp/tmpxft_0000799b_00000000-3_test.module_id" --include_file_name "tmpxft_0000799b_00000000-2_test.fatbin.c" "/tmp/tmpxft_0000799b_00000000-9_test.cpp1.ii" 
    #$ gcc -D__CUDA_ARCH__=200 -E -x c        -DCUDA_DOUBLE_MATH_FUNCTIONS  -D__CUDACC__ -D__NVCC__ -D__CUDANVVM__  -D__CUDA_PREC_DIV -D__CUDA_PREC_SQRT -m64 "/tmp/tmpxft_0000799b_00000000-4_test.cudafe1.gpu" > "/tmp/tmpxft_0000799b_00000000-10_test.cpp2.i" 
    #$ cudafe -w --allow_managed --m64 --gnu_version=50400 --c --gen_c_file_name "/tmp/tmpxft_0000799b_00000000-11_test.cudafe2.c" --stub_file_name "/tmp/tmpxft_0000799b_00000000-11_test.cudafe2.stub.c" --gen_device_file_name "/tmp/tmpxft_0000799b_00000000-11_test.cudafe2.gpu" --nv_arch "compute_20" --module_id_file_name "/tmp/tmpxft_0000799b_00000000-3_test.module_id" --include_file_name "tmpxft_0000799b_00000000-2_test.fatbin.c" "/tmp/tmpxft_0000799b_00000000-10_test.cpp2.i" 
    ./test.cu(1): error: identifier "__T20" is undefined in device code

    1 error detected in the compilation of "/tmp/tmpxft_0000799b_00000000-10_test.cpp2.i".
    # --error 0x2 --

During compilation, nvcc assigns the array (const unsigned int[]){90, 50, 100} to a variable named __T20 and declares it static, so it is inaccessible from the main file. The main file then contains: __constant__ const unsigned *ff = __T20; How can I initialize a global pointer with an array in CUDA?

Upvotes: 0

Views: 2332

Answers (1)

talonmies

Reputation: 72353

The compiler is telling you exactly what the error is. When you do this:

__constant__ const unsigned int *ff = (const unsigned int[]){90, 50, 100};

you are trying to statically assign the address of an anonymous host array to a device symbol. That makes no sense: even if it compiled, the address assigned to ff would be invalid because it would point into host memory.

To the best of my knowledge, there is no way of declaring and using anonymous objects in device memory in initialisation of statically declared global device symbols.

You can do something like this:

__device__ const unsigned int ee[3] = {90, 50, 100};
__constant__ const unsigned int *ff = &ee[0];


int main()
{
}

so that the static assignment is made with an address which the compiler can explicitly identify as being in device memory. Note that the expected caching properties of constant memory only apply to the pointer value and not to the memory it points to, so the use case for what you are trying to do is pretty limited, I would have thought.
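If the goal is for the table values themselves to be served from constant memory, one possible alternative is to drop the pointer and declare the array directly as a __constant__ symbol. The following is only a minimal sketch under that assumption; the kernel copy_table and the helper fetch_table are hypothetical names introduced for illustration and are not part of the original question:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// The array itself is placed in constant memory, so reads of its
// elements (not just of a pointer to it) use the constant cache.
__constant__ unsigned int ff[3] = {90, 50, 100};

// Hypothetical kernel: copies the table into global memory so the
// host can inspect what the device sees.
__global__ void copy_table(unsigned int *out)
{
    int i = threadIdx.x;
    if (i < 3)
        out[i] = ff[i];
}

// Hypothetical helper: runs the kernel and copies the three values
// back to the host. Returns false if any CUDA call fails.
bool fetch_table(unsigned int host_out[3])
{
    unsigned int *dev_out = NULL;
    if (cudaMalloc((void **)&dev_out, 3 * sizeof(unsigned int)) != cudaSuccess)
        return false;

    copy_table<<<1, 3>>>(dev_out);
    bool ok = (cudaGetLastError() == cudaSuccess);

    // cudaMemcpy waits for the kernel to finish before copying.
    if (ok)
        ok = (cudaMemcpy(host_out, dev_out, 3 * sizeof(unsigned int),
                         cudaMemcpyDeviceToHost) == cudaSuccess);

    cudaFree(dev_out);
    return ok;
}

int main()
{
    unsigned int values[3] = {0, 0, 0};
    if (!fetch_table(values)) {
        fprintf(stderr, "CUDA call failed\n");
        return 1;
    }
    printf("%u %u %u\n", values[0], values[1], values[2]);
    return 0;
}
```

Whether this fits depends on how ff is used elsewhere: code that needs to repoint ff at different tables would still need the pointer form shown above.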

Upvotes: 2
