Reputation: 39
Is it possible to copy a 2D host array allocated like this:
h_A=(int**)malloc(N*sizeof(int*));
for(i=0;i<N;i++)
{
h_A[i]=(int*)malloc(3*sizeof(int));
}
to a 2D device array allocated like this?
cudaMallocPitch((void**)&d_A, &pitch, 3*sizeof(int), N);
I've tried copying from host to device and back to host to check whether the process worked. The result was that only the first 2 rows were copied correctly:
https://drive.google.com/file/d/1gXpChyYd2Div0pDjTRxZhwYd7GHRfjXN/view?usp=sharing
Copy from host array h_A to device array d_A
cudaMemcpy2D(d_A, pitch, h_A, 3*sizeof(int), 3*sizeof(int), N, cudaMemcpyHostToDevice);
Copy from device array d_A to host array d_B
cudaMemcpy2D(h_B, pitch, d_A, 3*sizeof(int), 3*sizeof(int), N, cudaMemcpyDeviceToHost);
Upvotes: 0
Views: 151
Reputation: 72350
If you allocate an array of pointers to store rows, like this:
h_A=(int**)malloc(N*sizeof(int*));
for(i=0;i<N;i++)
{
h_A[i]=(int*)malloc(3*sizeof(int));
}
then allocating and copying that to a comparable device-side structure in conventional device memory requires this:
int** dh_A=(int**)malloc(N*sizeof(int*)); // host-side staging array holding the device row pointers
for(i=0;i<N;i++)
{
int* p;
cudaMalloc(&p, 3*sizeof(int)); // device storage for row i
cudaMemcpy(p, h_A[i], 3*sizeof(int), cudaMemcpyHostToDevice);
dh_A[i]=p;
}
int** d_A;
cudaMalloc(&d_A, N*sizeof(int*)); // device array of row pointers
cudaMemcpy(d_A, dh_A, N*sizeof(int*), cudaMemcpyHostToDevice);
[Note: all code written in browser, not guaranteed to compile or work correctly]
I will leave the device-to-host copy as an exercise for the reader. At this point you might conclude that it is better to just use linear memory on both the host and the device. That takes one allocation and one copy per direction instead of N+1 of each, so it is both simpler and faster.
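As a rough illustration of the linear-memory approach, here is a minimal sketch that keeps the host data in one contiguous block, so that cudaMemcpy2D can be used with the pitched allocation from the question. The helper name roundtrip_ok, the COLS macro and the test data are hypothetical and not part of the original code. Note that cudaMemcpy2D expects the source to point at row data, which is why passing the array of row pointers h_A does not work, and that the two pitch arguments swap between the host-to-device and device-to-host calls.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

#define COLS 3  /* row width from the question; the macro name is an assumption */

/* Hypothetical helper: copies an n x COLS host array to pitched device
   memory and back, then compares the two host copies.
   Returns 1 if every element survived the round trip, 0 otherwise. */
int roundtrip_ok(int n)
{
    size_t rowBytes = COLS * sizeof(int);
    size_t pitch = 0;
    int *d_A = NULL;

    /* One contiguous block per host array: element (r, c) is h[r * COLS + c] */
    int *h_A = (int *)malloc(n * rowBytes);
    int *h_B = (int *)malloc(n * rowBytes);
    if (h_A == NULL || h_B == NULL) {
        free(h_A);
        free(h_B);
        return 0;
    }

    for (int r = 0; r < n; r++)
        for (int c = 0; c < COLS; c++)
            h_A[r * COLS + c] = r * COLS + c;

    cudaError_t err = cudaMallocPitch((void **)&d_A, &pitch, rowBytes, n);

    /* Host to device: destination rows are `pitch` bytes apart,
       source rows are tightly packed (rowBytes apart) */
    if (err == cudaSuccess)
        err = cudaMemcpy2D(d_A, pitch, h_A, rowBytes,
                           rowBytes, n, cudaMemcpyHostToDevice);

    /* Device to host: the pitch arguments swap places */
    if (err == cudaSuccess)
        err = cudaMemcpy2D(h_B, rowBytes, d_A, pitch,
                           rowBytes, n, cudaMemcpyDeviceToHost);

    int ok = (err == cudaSuccess);
    if (ok) {
        for (int i = 0; i < n * COLS; i++)
            if (h_B[i] != h_A[i])
                ok = 0;
    } else {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
    }

    cudaFree(d_A);
    free(h_A);
    free(h_B);
    return ok;
}
```

Calling roundtrip_ok(N) from main should return 1 on a machine with a working CUDA device. Inside a kernel, row r of the pitched array would start at (int*)((char*)d_A + r * pitch).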
Upvotes: 1