Reputation: 27
I changed how I allocate host memory from method 1 to method 2, as shown in the code below. The code compiles and runs without any errors. I am just wondering whether method 2 is a proper way to allocate memory for a pointer to pointer, and whether it has any side effects.
#define TESTSIZE 10
#define DIGITSIZE 5
//Method 1
int **ra;
// The outer array holds pointers, so size it with sizeof(int *), not sizeof(int)
ra = (int **)malloc(TESTSIZE * sizeof(int *));
for(int i = 0; i < TESTSIZE; i++){
    ra[i] = (int *)malloc(DIGITSIZE * sizeof(int));
}
//Method 2
int **ra;
// The outer array holds pointers, so size it with sizeof(int *), not sizeof(int)
cudaMallocHost((void **)&ra, TESTSIZE * sizeof(int *));
for(int i = 0; i < TESTSIZE; i++){
    cudaMallocHost((void **)&ra[i], DIGITSIZE * sizeof(int));
}
Upvotes: 0
Views: 401
Reputation: 2683
Both of them work fine. Yet, there are differences between cudaMallocHost and malloc. The reason is that cudaMallocHost allocates pinned (page-locked) memory, so under the hood the OS does something similar to malloc plus some extra work to pin the pages. This means that cudaMallocHost generally takes longer.
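To make the difference concrete, here is a minimal sketch that allocates the same amount of host memory both ways. It assumes a single flat buffer of TESTSIZE * DIGITSIZE ints instead of the pointer-to-pointer layout from the question, and the variable names are only illustrative. The point to notice is that pinned memory is released with cudaFreeHost, not free.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define TESTSIZE 10
#define DIGITSIZE 5

int main() {
    // Assumption: one flat buffer replaces the int** layout from the question.
    const size_t bytes = TESTSIZE * DIGITSIZE * sizeof(int);

    // Pageable host memory: plain malloc, released with free.
    int *pageable = (int *)malloc(bytes);
    if (pageable == NULL) {
        fprintf(stderr, "malloc failed\n");
        return 1;
    }

    // Pinned (page-locked) host memory: released with cudaFreeHost.
    int *pinned = NULL;
    cudaError_t err = cudaMallocHost((void **)&pinned, bytes);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMallocHost failed: %s\n", cudaGetErrorString(err));
        free(pageable);
        return 1;
    }

    // From the CPU's point of view both buffers are ordinary host memory.
    for (int i = 0; i < TESTSIZE * DIGITSIZE; i++) {
        pageable[i] = i;
        pinned[i] = i;
    }

    free(pageable);
    cudaFreeHost(pinned);
    return 0;
}
```

With buffers this small the extra allocation cost is unlikely to be noticeable; it tends to matter when pinned allocations are large or made frequently.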
That being said, if you repeatedly cudaMemcpy from a single buffer, cudaMallocHost may pay off in the long run, since transfers from pinned memory are faster. Also, you are required to use pinned memory to overlap data transfers and computation with streams.
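As a rough illustration of the streams case, the sketch below splits one pinned buffer into two halves and issues an asynchronous copy, a kernel, and a copy back on a separate stream for each half. The kernel addOne, the CHECK macro, the two-stream split, and the flat buffer layout are all assumptions made for this example and are not taken from the question.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define TESTSIZE 10
#define DIGITSIZE 5

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t e = (call);                                       \
        if (e != cudaSuccess) {                                       \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(e));                           \
            exit(1);                                                  \
        }                                                             \
    } while (0)

// Hypothetical kernel: adds one to each of the first n elements.
__global__ void addOne(int *data, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) data[idx] += 1;
}

int main() {
    const int n = TESTSIZE * DIGITSIZE;
    const int half = n / 2;

    int *h_buf = NULL;  // pinned host buffer
    int *d_buf = NULL;  // device buffer
    CHECK(cudaMallocHost((void **)&h_buf, n * sizeof(int)));
    CHECK(cudaMalloc((void **)&d_buf, n * sizeof(int)));
    for (int i = 0; i < n; i++) h_buf[i] = i;

    cudaStream_t streams[2];
    for (int s = 0; s < 2; s++) CHECK(cudaStreamCreate(&streams[s]));

    // Each stream handles one half: copy in, compute, copy out.
    // Work queued on different streams may overlap on capable hardware.
    for (int s = 0; s < 2; s++) {
        int offset = s * half;
        CHECK(cudaMemcpyAsync(d_buf + offset, h_buf + offset,
                              half * sizeof(int),
                              cudaMemcpyHostToDevice, streams[s]));
        addOne<<<1, 64, 0, streams[s]>>>(d_buf + offset, half);
        CHECK(cudaMemcpyAsync(h_buf + offset, d_buf + offset,
                              half * sizeof(int),
                              cudaMemcpyDeviceToHost, streams[s]));
    }

    // Wait for both streams before reading the results on the host.
    CHECK(cudaDeviceSynchronize());
    printf("h_buf[0] = %d, h_buf[%d] = %d\n", h_buf[0], n - 1, h_buf[n - 1]);

    for (int s = 0; s < 2; s++) CHECK(cudaStreamDestroy(streams[s]));
    CHECK(cudaFree(d_buf));
    CHECK(cudaFreeHost(h_buf));
    return 0;
}
```

A buffer of 50 ints is far too small for any overlap to be measurable; the sketch only shows the call pattern. A single contiguous pinned buffer also keeps each transfer to one cudaMemcpyAsync call, which the row-by-row int** layout does not.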
Upvotes: 2