Spiros

Reputation: 2346

Usage of same constant memory array on different source files

I have a __constant__ memory array holding information that is needed by many kernels, which are placed in different source files. This constant memory array is defined in the header GlobalParameters.h, which is #included by all files containing kernels that need access to the array.

I just discovered (look at talonmies' answer) that __constant__ memory is only available in the translation unit where it is defined, unless you turn on separate compilation (with CUDA 5.0 or later).

I still do not completely understand what this means for my case.

Assuming that I cannot turn on separate compilation, is there a way to meet my needs? Where should I place the definition of my constant memory array? What happens if I place it in my header, which is #included in many translation units?

Assuming I can turn on separate compilation, should I declare my __constant__ memory array in the header as extern and place the definition inside a source file (e.g. GlobalParameters.cu)?

Upvotes: 4

Views: 2390

Answers (4)

MatrixR

Reputation: 1

Yes. The later CUDA documentation says: "When compiling in the separate compilation mode (see the nvcc user manual for a description of this mode), __device__, __shared__, __managed__ and __constant__ variables can be defined as external using the extern keyword. nvlink will generate an error when it cannot find a definition for an external variable (unless it is a dynamically allocated __shared__ variable)."

Upvotes: 0

Vitality

Reputation: 21475

The example below assumes that separate compilation can be used. It shows how to use extern to share constant memory across different compilation units.

FILE kernel.cu

#include "cuda_runtime.h"
#include "device_launch_parameters.h"

#include <stdio.h>

#include "Utilities.cuh"   // provides gpuErrchk (header not shown in this answer)

__constant__ int    N_GPU;
__constant__ float  a_GPU;

__global__ void printKernel();

int main()
{
    const int       N = 5;

    const float     a = 10.466f;

    gpuErrchk(cudaMemcpyToSymbol(N_GPU, &N, sizeof(int)));
    gpuErrchk(cudaMemcpyToSymbol(a_GPU, &a, sizeof(float)));

    printKernel<<<1, 1>>>();
    gpuErrchk(cudaPeekAtLastError());
    gpuErrchk(cudaDeviceSynchronize());

    return 0;
}

FILE otherCompilationUnit.cu

#include <stdio.h>

extern __constant__ int     N_GPU;
extern __constant__ float   a_GPU;

__global__ void printKernel() {

    printf("N = %i; a = %f\n", N_GPU, a_GPU);

}
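
Utilities.cuh is not shown above. The block below is a minimal sketch of what gpuErrchk could look like, assuming it is the usual error-checking wrapper around CUDA runtime calls. The helper name gpuAssert and the exit-on-error behaviour are assumptions for illustration, not the contents of the actual header.

```cuda
// Hypothetical stand-in for Utilities.cuh; the real header is not shown here.
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Prints the CUDA error string with file and line, then exits (assumed behaviour).
inline void gpuAssert(cudaError_t code, const char *file, int line)
{
    if (code != cudaSuccess) {
        fprintf(stderr, "GPUassert: %s %s %d\n",
                cudaGetErrorString(code), file, line);
        exit(code);
    }
}

// Wraps any call that returns cudaError_t.
#define gpuErrchk(ans) do { gpuAssert((ans), __FILE__, __LINE__); } while (0)
```

Both files then have to be built with relocatable device code enabled so that nvlink can resolve the extern symbols, for example `nvcc -rdc=true kernel.cu otherCompilationUnit.cu -o demo` (the output name demo is arbitrary).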

Upvotes: 2

ArchaeaSoftware

Reputation: 4422

One way to make constant memory available to translation units other than the one where it is defined is to call cudaGetSymbolAddress() and pass the resulting pointer to the other functions.

This technique is playing with fire to some degree, because if you use the pointer to write to the memory without appropriate barriers and synchronization, you may run afoul of the lack of coherency between constant memory and global memory.

Also, you may not get the full performance benefits of constant memory if you use this method. That should be less true on SM 2.x and later hardware - disassemble the object code and make sure the compiler is emitting "load uniform" instructions.
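
A minimal sketch of the pointer approach, kept in one file for brevity. The names params_GPU, sumParams and sumViaSymbolAddress are hypothetical. The kernel stands in for code that would live in another .cu file: it never names the constant symbol and only uses the pointer it receives.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Defined in this translation unit only (hypothetical name).
__constant__ float params_GPU[4];

// Stands in for a kernel from another .cu file: it only sees the pointer.
__global__ void sumParams(const float *params, int n, float *out)
{
    float s = 0.0f;
    for (int i = 0; i < n; ++i) s += params[i];
    *out = s;
}

// Uploads the values, fetches the symbol's device address and launches the
// kernel through that pointer. Returns the sum, or -1.0f on any CUDA error
// (a crude sentinel, good enough for a sketch).
float sumViaSymbolAddress(const float (&h)[4])
{
    float *d_params = nullptr, *d_out = nullptr;
    float result = -1.0f;

    if (cudaMemcpyToSymbol(params_GPU, h, sizeof(h)) != cudaSuccess) return -1.0f;
    if (cudaGetSymbolAddress((void **)&d_params, params_GPU) != cudaSuccess) return -1.0f;
    if (cudaMalloc(&d_out, sizeof(float)) != cudaSuccess) return -1.0f;

    sumParams<<<1, 1>>>(d_params, 4, d_out);

    // cudaMemcpy waits for the kernel to finish before copying.
    if (cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost) != cudaSuccess)
        result = -1.0f;
    cudaFree(d_out);
    return result;
}
```

In a real project the pointer would be obtained once in the defining file and handed to launchers in the other files. Whether the loads through it still go through the constant cache is exactly the performance caveat above, so it is worth checking the generated code.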

Upvotes: 3

hubs

Reputation: 1809

No. Without separate compilation it is not possible to use the same constant memory, defined once, across several .cu files.

In my opinion there are two possible workarounds.
The first is to implement all kernels within one .cu file. The disadvantage is that this file becomes very large and hard to keep an overview of.

The second is to define the constant memory again in every .cu file, and then use a wrapper in each file to copy the values into that file's constant memory, as I described in an answer here. The disadvantages are that you must not forget to copy the values in any of the .cu files, and that you have to check that you do not run into the limit on total available constant memory.
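
A minimal sketch of the second workaround for a single translation unit. The file name KernelsA.cu, the size N_PARAMS and all function names are hypothetical. Every other .cu file would repeat the same pattern with its own copy and its own wrapper.

```cuda
// One translation unit, e.g. a hypothetical KernelsA.cu.
#include <cuda_runtime.h>

#define N_PARAMS 4   // assumed size; would normally come from the shared header

// static: this copy is private to the file.
static __constant__ float params_A[N_PARAMS];

// Host wrapper that code in other files calls to fill this file's copy.
cudaError_t setParamsA(const float *h_params)
{
    return cudaMemcpyToSymbol(params_A, h_params, N_PARAMS * sizeof(float));
}

// Example kernel that reads this file's copy of the parameters.
__global__ void scaleA(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= params_A[i % N_PARAMS];
}

// Host launcher so callers in other files do not need to see the kernel.
cudaError_t launchScaleA(float *d_data, int n)
{
    scaleA<<<(n + 127) / 128, 128>>>(d_data, n);
    return cudaGetLastError();
}
```

The host code has to call every such wrapper (setParamsA and its counterparts in the other files) whenever the parameters change. Each file's copy also counts separately against the constant memory limit.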

Upvotes: 0
