CUDA: Copying device data to 2D host array

Question

I have a HostMatrix which was declared as:

float **HostMatrix

I need to copy the contents of a device matrix, pointed to by devicePointer, into the two-dimensional host matrix HostMatrix.

I tried this

for (int i=0; i



But I think this is wrong, since I am doing it inside a host function, and devicePointer cannot be manipulated directly in a host function the way I do in the last line.

So what is the correct way to achieve this?

Edit

Oh, actually this will work correctly! But the problem comes when de-allocating the memory, as discussed in my earlier question: CUDA: Invalid Device Pointer error when reallocating memory. So basically the following will be incorrect:

 for (int i=0; i

Shawn · Accepted Answer

You first need to allocate devicePointer with all the memory required. Incrementing the pointer as you go is probably not the easiest approach, though, because the free at the end will then be broken. Say you have nRows rows of nCols elements each. Then this should work (I didn't try it, but the idea should be OK):

float* dPtr;
cudaMalloc(&dPtr, nRows * nCols * sizeof(float));  // size is given in bytes
for (int i=0; i< nRows; i++){
    cudaMemcpy(HostMatrix[i], dPtr + i * nCols, nCols * sizeof(float), cudaMemcpyDeviceToHost);
}
// do whatever you want
cudaFree(dPtr);

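The issue is that if you keep incrementing dPtr, the cudaFree at the end is called on a pointer to the "last row" rather than on the address cudaMalloc returned, so it is invalid. Computing dPtr + i * nCols for each row leaves dPtr itself untouched.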

Does it make sense?
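For completeness, here is a minimal self-contained sketch of the same idea with error checking added. It assumes the device matrix is stored flat in row-major order and that each HostMatrix row was allocated separately. The names CUDA_CHECK, fillMatrix and copyRowsToHost, and the sizes 4 x 8, are made up for illustration and are not from the question.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro: abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                           \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,     \
                    cudaGetErrorString(err_));                     \
            exit(EXIT_FAILURE);                                    \
        }                                                          \
    } while (0)

// Illustrative kernel: writes each element's flat index into the matrix.
__global__ void fillMatrix(float* d, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) d[idx] = static_cast<float>(idx);
}

// Hypothetical helper: copy a flat row-major device matrix into a float**
// host matrix, one row per cudaMemcpy. dBase is never modified.
cudaError_t copyRowsToHost(float** hostMatrix, const float* dBase,
                           int nRows, int nCols) {
    for (int i = 0; i < nRows; i++) {
        cudaError_t err = cudaMemcpy(hostMatrix[i],
                                     dBase + static_cast<size_t>(i) * nCols,
                                     nCols * sizeof(float),
                                     cudaMemcpyDeviceToHost);
        if (err != cudaSuccess) return err;
    }
    return cudaSuccess;
}

int main() {
    const int nRows = 4, nCols = 8;  // assumed sizes
    const int n = nRows * nCols;

    // Host matrix as float**: every row is its own allocation.
    float** HostMatrix = static_cast<float**>(malloc(nRows * sizeof(float*)));
    for (int i = 0; i < nRows; i++)
        HostMatrix[i] = static_cast<float*>(malloc(nCols * sizeof(float)));

    // One device allocation for the whole matrix; the size is in bytes.
    float* dPtr = nullptr;
    CUDA_CHECK(cudaMalloc(&dPtr, n * sizeof(float)));

    fillMatrix<<<(n + 127) / 128, 128>>>(dPtr, n);
    CUDA_CHECK(cudaGetLastError());

    // cudaMemcpy waits for the kernel on the default stream to finish.
    CUDA_CHECK(copyRowsToHost(HostMatrix, dPtr, nRows, nCols));

    printf("last element: %f\n", HostMatrix[nRows - 1][nCols - 1]);

    // dPtr still holds the address cudaMalloc returned, so this is valid.
    CUDA_CHECK(cudaFree(dPtr));
    for (int i = 0; i < nRows; i++) free(HostMatrix[i]);
    free(HostMatrix);
    return 0;
}
```

The row-by-row copy is needed only because the rows of a float** matrix are not guaranteed to be contiguous in host memory. If the host side were a single flat buffer, one cudaMemcpy of nRows * nCols * sizeof(float) bytes would do.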
