CUDA memory allocation

Question

I am working with a Jetson TX2. I capture images from a camera as unsigned char *image.

Then I need to do some image processing on the GPU. On the Jetson TX2, the host-to-device and device-to-host transfers can be avoided because the RAM is shared between the GPU and the CPU. To take advantage of that, I allocate the image like this:

int height = 6004;
int width = 7920;
int NumElement = height * width;
unsigned char *img1;
cudaMallocManaged(&img1, NumElement * sizeof(unsigned char));

With this method, there is no PCIe transfer bottleneck. My problem is how to get the image from the buffer into img1. This method works, but it takes too long:

for(int i =0 ; i



I lose the advantage of the GPU by using a naive for loop. And if I just use this method:

img = buffer


then I have a problem when I enter the kernel.
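For reference, a plain pointer assignment only makes the pointer refer to the camera's host buffer. It does not move any pixels into the managed allocation, which would explain the problem inside the kernel. Below is a minimal sketch of one possible alternative to the element-by-element loop: a single bulk copy into the managed buffer followed by the kernel launch. It assumes buffer is ordinary host memory of NumElement bytes. The names processFrame and invert are hypothetical, and the kernel is only a stand-in for the real image processing.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the real processing: inverts every pixel.
__global__ void invert(unsigned char *img, size_t n)
{
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) {
        img[i] = 255 - img[i];
    }
}

// Copies one captured frame into the managed buffer and processes it on the GPU.
// Assumption: `buffer` is ordinary host memory holding n bytes, and `img1`
// was allocated with cudaMallocManaged with at least n bytes.
cudaError_t processFrame(const unsigned char *buffer, unsigned char *img1, size_t n)
{
    // One bulk copy replaces the per-element loop. No kernel may be using
    // img1 while the CPU writes to it.
    std::memcpy(img1, buffer, n);

    const int threads = 256;
    const unsigned int blocks = (unsigned int)((n + threads - 1) / threads);
    invert<<<blocks, threads>>>(img1, n);

    // Report launch errors, then wait so the CPU can safely read img1 again.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        return err;
    }
    return cudaDeviceSynchronize();
}
```

With the allocation from the question, the call would be processFrame(buffer, img1, NumElement). The copy still costs one pass over the image. Avoiding it entirely would require the capture API to write directly into img1, and whether that is possible depends on the camera library, which is not shown here.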
