Accessing same device memory from multiple cuda files

Question

I want to write a program in which kernels in several CUDA files access the same device memory. A simplified example is given below, in which main.c calls functions from three .cu files: cuda_malloc.cu, cuda_print.cu and cuda_free.cu.

main.c file: declares a pointer "d_array"

main()
{
int maxpar = 10;

float* d_array;

cuda_malloc(maxpar, d_array);

cuda_print(maxpar,d_array);

cuda_free(d_array);
}

cuda_malloc.cu file: allocates device memory for d_array and sets values to zero.

extern "C" void cuda_malloc(int maxpar, float* d_array)
{
    CUDA_SAFE_CALL(cudaMalloc((void**)&d_array,sizeof(float)*maxpar));
    CUDA_SAFE_CALL(cudaMemset(d_array,'\0',sizeof(float)*maxpar));
}

cuda_print.cu file: calls "kernel" to print "d_array" from the device memory

extern "C"
{
__global__ void kernel(int maxpar, float* d_array)
{
    int tid = threadIdx.x;
    if (tid >= maxpar) return;
    printf("tId = %d, d_array[i] = %f \n",tid,d_array[tid]);
}

    void cuda_print(int maxpar, float* d_array)
{
    //If I un-comment the following 2 lines, the kernel function prints array values
    //otherwise, it does not
    //CUDA_SAFE_CALL(cudaMalloc((void**)&d_array,sizeof(float)*maxpar));
    //CUDA_SAFE_CALL(cudaMemset(d_array,'\0',sizeof(float)*maxpar));

    kernel <<<1, maxpar>>> (maxpar,d_array);

    cudaDeviceSynchronize();
    cudaGetLastError();
}
}

cuda_free.cu file: frees the device memory

extern "C" void cuda_free(float* d_array)
{
    CUDA_SAFE_CALL(cudaFree(d_array));
}

This code compiles fine. Notice that I am trying to print "d_array" in the "kernel" function launched from the "cuda_print.cu" file. However, nothing is printed, and no error is reported either. If I allocate the device memory again in "cuda_print.cu" and memset it to zero, then the kernel prints the values.

My question is: how can I access the same device memory from multiple cuda files?

Thanks

stuhlo · Accepted Answer

Your problem is in the function void cuda_malloc(int maxpar, float* d_array). When you call:

CUDA_SAFE_CALL(cudaMalloc((void**)&d_array,sizeof(float)*maxpar));
CUDA_SAFE_CALL(cudaMemset(d_array,'\0',sizeof(float)*maxpar));

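d_array is changed only locally: the pointer is passed by value, so cudaMalloc stores the new address in the function's own copy, and the d_array declared in main never receives it.

Instead, your function should take a pointer to the pointer: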

extern "C" void cuda_malloc(int maxpar, float** d_array) {
    CUDA_SAFE_CALL(cudaMalloc((void**)d_array,sizeof(float)*maxpar));
    CUDA_SAFE_CALL(cudaMemset(*d_array,'\0',sizeof(float)*maxpar));    
}

and call it like this:

cuda_malloc(maxpar, &d_array);
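
The same pass-by-value behaviour can be reproduced on the host without a GPU. The following is only a minimal sketch in plain C, assuming calloc as a stand-in for cudaMalloc plus cudaMemset; the function names alloc_by_value and alloc_by_address are hypothetical and not part of the original code.

```c
#include <stdlib.h>

/* Broken pattern (mirrors the original cuda_malloc): the pointer is
   passed by value, so the assignment changes only the local copy. */
void alloc_by_value(int n, float *array)
{
    array = (float *)calloc((size_t)n, sizeof(float));
    free(array); /* freed here only so this sketch does not leak */
}

/* Fixed pattern (mirrors the corrected cuda_malloc): the caller passes
   the address of its pointer, so the allocation reaches the caller. */
int alloc_by_address(int n, float **array)
{
    *array = (float *)calloc((size_t)n, sizeof(float));
    return (*array != NULL) ? 0 : -1; /* 0 on success, -1 on failure */
}
```

Calling alloc_by_value(10, p) with a pointer p leaves p unchanged in the caller, while alloc_by_address(10, &p) leaves p pointing at ten zeroed floats that the caller must later release with free(p). The device version behaves the same way, which is why every other .cu file can use the buffer once main holds the address returned through &d_array.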
