Reputation: 453
I have a very long array on the device, and for a condition check I want to access only one element from the middle (say, the Nth element) on the host (CPU). What is the most efficient way to do this?
Do I need to write a kernel that writes the Nth element of the source array into a single-element array, and then copy that single-element array to the host?
Upvotes: 0
Views: 1066
Reputation: 29
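One addendum to the cudaMemcpy answer: you may need to take the bytes per element of your array into account. With the driver API, a CUdeviceptr is an integer handle rather than a typed pointer, so the offset has to be scaled by the element size yourself. For example, for an array of arrays of various types on the device: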
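#ifdef CUDA_KERNEL
char* mgpu[ MAX_BUF ]; // Device array of pointers to arrays of various types.
#else
CUdeviceptr mgpu[ MAX_BUF ]; // On the host, each entry is a device pointer.
CUdeviceptr gpu (int n) { return mgpu[n]; } // Accessor for the n-th device buffer.
#endif
// Host side: CPUelement is an int, offset is the element index.
CUdeviceptr GPUpointer = m_Fluid.gpu(FGRIDOFF); // Device pointer to the FGRIDOFF (int) array.
// CUdeviceptr arithmetic is in bytes, so scale the index by sizeof(int).
cuMemcpyDtoH (&CPUelement, GPUpointer + (offset * sizeof(int)), sizeof(int));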
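To make the byte-offset point concrete, here is a minimal self-contained sketch using the driver API. The helper name readIntElement, the array size, and the fill pattern are my own assumptions for illustration; they are not part of the code above. It uses the primary context of device 0 and should build with something like nvcc example.cu -lcuda.

```cuda
#include <cstdio>
#include <vector>
#include <cuda.h>

// Hypothetical helper (not from the snippet above): read element `index`
// of an int array that starts at device address `base`.
CUresult readIntElement(CUdeviceptr base, size_t index, int* out) {
    // CUdeviceptr is an integer type, so the offset must be given in bytes.
    return cuMemcpyDtoH(out, base + index * sizeof(int), sizeof(int));
}

int main() {
    CUdevice dev;
    CUcontext ctx;
    if (cuInit(0) != CUDA_SUCCESS) return 1;
    if (cuDeviceGet(&dev, 0) != CUDA_SUCCESS) return 1;
    if (cuDevicePrimaryCtxRetain(&ctx, dev) != CUDA_SUCCESS) return 1;
    if (cuCtxSetCurrent(ctx) != CUDA_SUCCESS) return 1;

    // Assumed test data: host[i] = 2 * i.
    std::vector<int> host(1000);
    for (size_t i = 0; i < host.size(); ++i) host[i] = static_cast<int>(2 * i);
    const size_t bytes = host.size() * sizeof(int);

    CUdeviceptr dArray = 0;
    if (cuMemAlloc(&dArray, bytes) != CUDA_SUCCESS) return 1;
    if (cuMemcpyHtoD(dArray, host.data(), bytes) != CUDA_SUCCESS) return 1;

    // Fetch only the middle element; no kernel launch is involved.
    int value = -1;
    if (readIntElement(dArray, 500, &value) != CUDA_SUCCESS) return 1;
    std::printf("element 500 = %d\n", value);

    cuMemFree(dArray);
    cuDevicePrimaryCtxRelease(dev);
    return 0;
}
```

If the element type were double instead of int, the same pattern would apply with sizeof(double) in both the offset and the byte count.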
Upvotes: 0
Reputation: 1052
You can copy a single element of an array using cudaMemcpy. No extra kernel is needed.
Let's say you want to copy the N-th element of the array
int * dSourceArray
to the variable
int hTargetVariable
You can apply device pointer arithmetic on the host. All you need to do is advance the dSourceArray pointer by N elements and copy a single element:
cudaMemcpy(&hTargetVariable, dSourceArray+N, sizeof(int), cudaMemcpyDeviceToHost);
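Keep in mind that if kernels in other streams (especially non-blocking ones) may still be writing to the array, you should synchronize first, for example with cudaStreamSynchronize on the producing stream or with cudaDeviceSynchronize, before transferring the data.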
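For completeness, here is a small self-contained sketch of the same idea with the runtime API. The helper name copyElementToHost, the array length, and the fill values are assumptions made for the example. Only dSourceArray, hTargetVariable and N come from the text above.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: copy element n of a device int array into *out.
// dArray + n is ordinary typed pointer arithmetic, so it advances by
// n * sizeof(int) bytes; the device pointer is never dereferenced on the host.
cudaError_t copyElementToHost(const int* dArray, size_t n, int* out) {
    return cudaMemcpy(out, dArray + n, sizeof(int), cudaMemcpyDeviceToHost);
}

int main() {
    const size_t count = 1 << 20;   // assumed array length
    const size_t N = count / 2;     // index of the element we want

    // Assumed test data: hSource[i] = i.
    std::vector<int> hSource(count);
    for (size_t i = 0; i < count; ++i) hSource[i] = static_cast<int>(i);

    int* dSourceArray = nullptr;
    if (cudaMalloc(&dSourceArray, count * sizeof(int)) != cudaSuccess) return 1;
    if (cudaMemcpy(dSourceArray, hSource.data(), count * sizeof(int),
                   cudaMemcpyHostToDevice) != cudaSuccess) return 1;

    // Transfer exactly one int back to the host.
    int hTargetVariable = -1;
    cudaError_t err = copyElementToHost(dSourceArray, N, &hTargetVariable);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "copy failed: %s\n", cudaGetErrorString(err));
        cudaFree(dSourceArray);
        return 1;
    }
    std::printf("element %zu = %d\n", N, hTargetVariable);

    cudaFree(dSourceArray);
    return 0;
}
```

The transfer moves only sizeof(int) bytes, so its cost is dominated by the fixed latency of a device-to-host copy rather than by the size of the array.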
Upvotes: 2