Reputation: 159
I am trying to copy the data in a cv::cuda::GpuMat to a uint8_t* variable which is to be used in a kernel.
The GpuMat contains image data of resolution 752x480 and type CV_8UC1. Below is the sample code:
uint8_t *imgPtr;
cv::Mat left, downloadedLeft;
cv::cuda::GpuMat gpuLeft;
left = imread("leftview.jpg", cv::IMREAD_GRAYSCALE);
gpuLeft.upload(left);
cudaMalloc((void **)&imgPtr, sizeof(uint8_t)*gpuLeft.rows*gpuLeft.cols);
cudaMemcpyAsync(imgPtr, gpuLeft.ptr<uint8_t>(), sizeof(uint8_t)*gpuLeft.rows*gpuLeft.cols, cudaMemcpyDeviceToDevice);
// following code is just for testing and visualization...
cv::cuda::GpuMat gpuImg(left.rows, left.cols, left.type(), imgPtr);
gpuImg.download(downloadedLeft);
imshow ("test", downloadedLeft);
waitKey(0);
But the output is not as expected. The following are the input and output images, respectively.
I have tried giving the cv::Mat source to cudaMemcpy, and that seems to work fine. The issue seems to be with the combination of cv::cuda::GpuMat and cudaMemcpy. A similar issue is discussed here.
Also, if the image width is 256 or 512, the program seems to work fine.
What is it that I am missing? What should be done for the 752x480 image to work properly?
Upvotes: 2
Views: 9525
Reputation: 72349
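OpenCV GpuMat uses strided (pitched) storage, so the rows of the image are not stored contiguously in memory. In short, your example fails in most cases because it allocates and copies only rows*cols bytes, ignoring the padding at the end of each row, and then builds the second GpuMat without passing the row step.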
By my reading of the documentation, you probably want something like this:
uint8_t *imgPtr;
cv::Mat left, downloadedLeft;
cv::cuda::GpuMat gpuLeft;
left = imread("leftview.jpg", cv::IMREAD_GRAYSCALE);
gpuLeft.upload(left);
cudaMalloc((void **)&imgPtr, gpuLeft.rows*gpuLeft.step);
cudaMemcpyAsync(imgPtr, gpuLeft.ptr<uint8_t>(), gpuLeft.rows*gpuLeft.step, cudaMemcpyDeviceToDevice);
// following code is just for testing and visualization...
cv::cuda::GpuMat gpuImg(left.rows, left.cols, left.type(), imgPtr, gpuLeft.step);
gpuImg.download(downloadedLeft);
imshow ("test", downloadedLeft);
waitKey(0);
[Written by someone who has never used OpenCV, not compiled or tested, use at own risk]
Your code would only work correctly when the row pitch of the GpuMat happens to equal the number of columns times the size of the element type stored in the matrix. That is most likely for images whose widths are powers of two, which matches your observation that widths of 256 and 512 work.
Upvotes: 5