Passing class with pointer to array from C++ to CUDA

Question

I have the following class in C++:

template <class T>
class dynArray {

 public:
    T *elements;
    int size;
    int capacity;
    int initCapacity;
};

Is there any way to copy an object of this class to use in a CUDA kernel using cudaMemcpy() without having to copy its content element by element?

Thanks in advance.

Mehrwolf · Accepted Answer

First thoughts

It seems to me that you want something like std::vector<> on the GPU. I would advise you to think carefully about whether you need only the data in GPU global memory or also the size of the vector. IMHO, code on the GPU should only modify the data of the array, not resize the array itself. Resizing is something that should be done on the host.

There is an open-source library called AGILE, which implements a GPUVector that is basically something like std::vector<> on the GPU. The GPUVector stores the capacity, the size, and a pointer to the GPU memory. A kernel that operates on a GPUVector gets the pointer to the memory area and the size as arguments, i.e. the kernel call looks something like this:

GPUVector v;
[... initialize v...]
computationKernel<<<gridSize, blockSize>>>(v.data(), v.size());

Translating this to your class, GPUVector::data() would just return dynArray::elements (which points to GPU memory) and GPUVector::size() would return dynArray::size. The dynArray::size member should stay on the CPU side because you most likely do not want to modify it from GPU code (for example because you cannot call cudaMalloc from the GPU). If you don't modify it, you can just as well pass it as a kernel parameter.
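
The pointer-plus-size pattern described above can be sketched in plain CUDA without any library. This is only a minimal illustration under my own assumptions: scaleKernel and scaleOnDevice are hypothetical names, float is an arbitrary element type, and error checking is left out for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: scales each element in place. It only sees the raw
// device pointer and the element count, never the container object itself.
__global__ void scaleKernel(float* data, int size, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < size)
        data[i] *= factor;
}

// Hypothetical host helper: uploads the values, runs the kernel and
// downloads the result. The size stays on the host the whole time and is
// passed to the kernel by value.
std::vector<float> scaleOnDevice(const std::vector<float>& host, float factor)
{
    const int size = static_cast<int>(host.size());
    std::vector<float> result(host.size());
    if (size == 0)
        return result;

    float* deviceData = nullptr;
    cudaMalloc(&deviceData, size * sizeof(float));
    cudaMemcpy(deviceData, host.data(), size * sizeof(float),
               cudaMemcpyHostToDevice);

    const int threadsPerBlock = 256;
    const int numBlocks = (size + threadsPerBlock - 1) / threadsPerBlock;
    scaleKernel<<<numBlocks, threadsPerBlock>>>(deviceData, size, factor);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(result.data(), deviceData, size * sizeof(float),
               cudaMemcpyDeviceToHost);
    cudaFree(deviceData);
    return result;
}
```

Calling scaleOnDevice({1.0f, 2.0f, 3.0f}, 2.0f) should give back a vector holding the doubled values. The point of the design is that only the bulk data needs a cudaMemcpy; the bookkeeping members travel as ordinary kernel arguments.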

Another library you might want to look at is Thrust, which also provides an STL-like vector on the GPU.
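
As a rough idea of what the Thrust route could look like, here is a small sketch that lets thrust::device_vector own the GPU memory and hands a raw pointer plus the size to a hand-written kernel. The names addOneKernel and addOneWithThrust are hypothetical and only serve the illustration.

```cuda
#include <thrust/device_vector.h>
#include <thrust/copy.h>
#include <vector>

// Hypothetical kernel: increments every element by one.
__global__ void addOneKernel(int* data, int size)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < size)
        data[i] += 1;
}

// Hypothetical host helper: the device_vector allocates, copies and frees
// the GPU memory, so no explicit cudaMalloc/cudaFree is needed here.
std::vector<int> addOneWithThrust(const std::vector<int>& host)
{
    // Constructing from a host range copies the data to the device.
    thrust::device_vector<int> deviceVec(host.begin(), host.end());
    const int size = static_cast<int>(deviceVec.size());

    if (size > 0)
    {
        // Kernels cannot take a device_vector, so pass the raw pointer.
        int* raw = thrust::raw_pointer_cast(deviceVec.data());
        const int threadsPerBlock = 256;
        const int numBlocks = (size + threadsPerBlock - 1) / threadsPerBlock;
        addOneKernel<<<numBlocks, threadsPerBlock>>>(raw, size);
    }

    // Copy the device data back into an ordinary host vector.
    std::vector<int> result(host.size());
    thrust::copy(deviceVec.begin(), deviceVec.end(), result.begin());
    return result;
}
```

Resizing, if it is ever needed, would then happen on the host through the device_vector, which matches the advice above to keep size management out of GPU code.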

A copy method for dynArray

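If you still want to copy the whole object, I would suggest the following approach: copy the object itself, copy the elements array separately, and then patch the elements pointer inside the device copy so that it refers to device memory: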

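#include <cuda_runtime.h>

template <class T>
class dynArray 
{
  public:
    //! Copies this dynArray to the GPU and returns a pointer to the copy.
    //! The caller is responsible for releasing both device allocations.
    void* copyToDevice()
    {
        // Copy the dynArray object itself to the device. At this point the
        // elements pointer in the copy still holds a host address.
        void* deviceArray;
        cudaMalloc(&deviceArray, sizeof(dynArray));
        cudaMemcpy(deviceArray, this, sizeof(dynArray), 
                   cudaMemcpyHostToDevice);
    
        // Copy the elements array to the device.
        void* deviceElements;
        cudaMalloc(&deviceElements, sizeof(T) * capacity);
        cudaMemcpy(deviceElements, elements, sizeof(T) * capacity, 
                   cudaMemcpyHostToDevice);
    
        // On the device, the elements pointer has to point to deviceElements.
        // Copy the pointer value itself (hence &deviceElements). Writing to
        // the start of deviceArray works because elements is the first
        // member, i.e. it sits at offset 0.
        cudaMemcpy(deviceArray, &deviceElements, sizeof(T*),
                   cudaMemcpyHostToDevice);

        return deviceArray;
    }
    
    T *elements;
    int size;
    int capacity;
    int initCapacity;
};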
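
To illustrate how such a device-side copy might be consumed and released again, here is a minimal, self-contained sketch. It uses a free-function variant of the copy on a plain struct stand-in for dynArray, and it locates the elements member with offsetof instead of assuming its position. The names sumKernel, freeDeviceCopy and sumOnDevice are hypothetical, and error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <cstddef>

// Stand-in for the class above (assumption: plain public data members).
template <class T>
struct dynArray
{
    T* elements;
    int size;
    int capacity;
    int initCapacity;
};

// Free-function variant of the copy: object first, then the elements,
// then patch the elements pointer inside the device copy.
template <class T>
dynArray<T>* copyToDevice(const dynArray<T>& host)
{
    dynArray<T>* deviceArray = nullptr;
    cudaMalloc(&deviceArray, sizeof(dynArray<T>));
    cudaMemcpy(deviceArray, &host, sizeof(dynArray<T>), cudaMemcpyHostToDevice);

    T* deviceElements = nullptr;
    cudaMalloc(&deviceElements, sizeof(T) * host.capacity);
    cudaMemcpy(deviceElements, host.elements, sizeof(T) * host.capacity,
               cudaMemcpyHostToDevice);

    // Overwrite only the elements member of the device copy.
    char* field = reinterpret_cast<char*>(deviceArray)
                  + offsetof(dynArray<T>, elements);
    cudaMemcpy(field, &deviceElements, sizeof(T*), cudaMemcpyHostToDevice);
    return deviceArray;
}

// Releases both allocations made by copyToDevice().
template <class T>
void freeDeviceCopy(dynArray<T>* deviceArray)
{
    // Read the object back to learn the device address of the elements.
    dynArray<T> shadow;
    cudaMemcpy(&shadow, deviceArray, sizeof(dynArray<T>), cudaMemcpyDeviceToHost);
    cudaFree(shadow.elements);
    cudaFree(deviceArray);
}

// Single-threaded kernel for clarity: sums the first size elements.
__global__ void sumKernel(const dynArray<int>* arr, int* result)
{
    int sum = 0;
    for (int i = 0; i < arr->size; ++i)
        sum += arr->elements[i];
    *result = sum;
}

int sumOnDevice(const dynArray<int>& host)
{
    dynArray<int>* deviceArray = copyToDevice(host);
    int* deviceResult = nullptr;
    cudaMalloc(&deviceResult, sizeof(int));

    sumKernel<<<1, 1>>>(deviceArray, deviceResult);

    int result = 0;
    cudaMemcpy(&result, deviceResult, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(deviceResult);
    freeDeviceCopy(deviceArray);
    return result;
}
```

For a host array with size 3, capacity 4 and contents 1, 2, 3, 0, sumOnDevice should return 6, since only the first size elements are read. Note that the object needs three host-to-device transfers in total, which is part of why passing the pointer and the size as kernel arguments is usually the simpler choice.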
