user13162349

CUDA memory allocation

I am working with a Jetson TX2. I capture images from a camera as unsigned char *image.

Then I need to do some image processing, for which I use the GPU. On the Jetson TX2 the host-to-device and device-to-host transfers can be avoided, because the RAM is shared between the GPU and the CPU. To take advantage of that, I use:

int height = 6004;
int width = 7920;
int NumElement = height * width;
unsigned char *img;
cudaMallocManaged(&img, NumElement * sizeof(unsigned char));

With this method there is no PCIe transfer bottleneck. My problem is how to assign the image from the buffer to img. This method works, but it is too slow:

for(int i =0 ; i<NumElement ; i++)
    img[i] = buffer[i] ;

I lose the advantage of the GPU by using a naive for loop. And if I just use this instead:

img = buffer;

then I have a problem when I enter the kernel.

Upvotes: 0

Views: 211

Answers (1)

talonmies

Reputation: 72344

Use cudaMemcpy with cudaMemcpyDefault, something like:

cudaMemcpy(&img[0], &buffer[0], NumElement * sizeof(unsigned char), cudaMemcpyDefault);

You could also potentially use memcpy
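
For context, here is a rough end-to-end sketch of how that copy might fit into the code from the question. Several things are assumptions made for illustration and do not come from the question: the `invert` kernel, the `malloc`'d `buffer` standing in for the camera buffer, the fill value, and the launch configuration. The copy goes destination first, source second, so the data flows from `buffer` into the managed allocation.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical kernel, for illustration only: inverts every pixel in place.
__global__ void invert(unsigned char *img, size_t n)
{
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n)
        img[i] = 255 - img[i];
}

int main()
{
    const size_t height = 6004, width = 7920;
    const size_t numElement = height * width;

    // Stand-in for the camera buffer (ordinary host memory, assumed here).
    unsigned char *buffer = (unsigned char *)malloc(numElement);
    if (!buffer) return 1;
    memset(buffer, 10, numElement);

    // Managed allocation, visible to both the CPU and the GPU.
    unsigned char *img = nullptr;
    if (cudaMallocManaged(&img, numElement) != cudaSuccess) return 1;

    // Bulk copy, destination first: buffer -> img.
    if (cudaMemcpy(img, buffer, numElement, cudaMemcpyDefault) != cudaSuccess) return 1;
    // Host-side alternative: memcpy(img, buffer, numElement);

    const int threads = 256;
    const unsigned int blocks = (unsigned int)((numElement + threads - 1) / threads);
    invert<<<blocks, threads>>>(img, numElement);

    // Wait for the kernel before the CPU touches the managed memory again.
    if (cudaDeviceSynchronize() != cudaSuccess) return 1;

    printf("first pixel after kernel: %d\n", img[0]);

    cudaFree(img);
    free(buffer);
    return 0;
}
```

Either copy replaces the element-by-element loop with a single bulk transfer. If you go the plain memcpy route, make sure no kernel is still running on the managed allocation while the CPU writes to it; synchronizing first, as above, is the safe option.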

Upvotes: 1
