Abid Rahman K

Reputation: 52646

CUDA - convert RGB image to Grayscale

I am starting to learn CUDA GPU programming from the Udacity video course (the course is 2 years old). I am using CUDA 5.5 with Visual Studio Express 2012 (student edition, so not all CUDA debugging features are available) on an Nvidia GeForce GT 630M GPU.

So far I have implemented vector addition and a few other simple operations.

Now I am trying to convert an RGB image to grayscale. I am reading the image with the help of OpenCV. (Every method I tried has failed, which is why I am here.)

Below is my .cpp file : https://gist.github.com/abidrahmank/7020863

Below is my .cu file : https://gist.github.com/abidrahmank/7020910

My input image is a simple 64x64 color image. (I first used a 512x512 image, which didn't work, so I went down to 64x64 to check whether the size was the problem. It doesn't seem to be.)

Problem

The output image of my CUDA implementation is white: almost every value is 255. There are a few gray pixels here and there, maybe less than 1% of the image. Everything else is white.

What I tried:

Over three days, I tried the following things:

  1. I thought the problem might be the image size, that the number of threads was not optimal or something like that, so I reduced the image size. Still the same result.
  2. I tried a similar example: I created a 64x64 array, took its elements two at a time, and computed the square of their sum. It worked fine. Here is the code : https://gist.github.com/abidrahmank/7021023
  3. I started checking the data at each stage. The input image just before it is copied to the GPU is fine. But the input data, when I check it inside the kernel, is always 255. (Check line 14 here)
  4. Finally, I set all the GPU data to zero using cudaMemset and checked the input data inside the kernel. It is still 255.

So I have no option left other than asking on Stack Overflow.

Can anyone tell me what is the mistake I am making?

Upvotes: 1

Views: 11025

Answers (2)

kangshiyin

Reputation: 9781

Besides the parameter issue pointed out by DanielKO, you also have problems with your thread/block settings.

Since you already treat your 2-D image as a 1-D array, here's a good example showing how to set the thread/block configuration for data of arbitrary size.

https://developer.nvidia.com/content/easy-introduction-cuda-c-and-c
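As a rough illustration of that pattern (not the code from the linked article), here is a minimal sketch of a 1-D launch for an array of arbitrary length. The kernel `invert`, the wrapper `invertOnGpu`, and the block size of 256 are assumptions made up for this example. The idea is to round the number of blocks up and guard against the extra threads in the last block.

```cuda
#include <cuda_runtime.h>

// Hypothetical element-wise kernel over a flat array of n bytes.
__global__ void invert(const unsigned char* d_in, unsigned char* d_out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)  // the last block may contain threads past the end of the data
        d_out[i] = 255 - d_in[i];
}

// Hypothetical host wrapper; returns true if every CUDA call succeeded.
bool invertOnGpu(const unsigned char* h_in, unsigned char* h_out, int n)
{
    const int threadsPerBlock = 256;  // assumed; any value up to the device limit works
    const int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;  // round up

    unsigned char* d_in = 0;
    unsigned char* d_out = 0;
    cudaError_t err = cudaMalloc(&d_in, n);
    if (err == cudaSuccess) err = cudaMalloc(&d_out, n);
    if (err == cudaSuccess) err = cudaMemcpy(d_in, h_in, n, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        invert<<<blocks, threadsPerBlock>>>(d_in, d_out, n);
        err = cudaGetLastError();  // catches an invalid launch configuration
    }
    if (err == cudaSuccess) err = cudaMemcpy(h_out, d_out, n, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return err == cudaSuccess;
}
```

With this layout the array length does not have to be a multiple of the block size, and the block size stays within the device limit no matter how wide the image is.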

Upvotes: 1

DanielKO

Reputation: 4517

Your kernel signature says:

__global__ void kernel(unsigned char* d_in, unsigned char* d_out)

But you call it like:

kernel<<<rows,cols>>>(d_out, d_in);

Which one is in and which one is out?
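One way to make this kind of mix-up harder is sketched below. It is not the code from the gists: the names `bgrToGray` and `convertToGray`, the interleaved 8-bit BGR layout (OpenCV's default), and the luma weights are assumptions. The input parameter is `const`, and the host side holds the input through a `const` pointer, so swapping the two arguments fails to compile.

```cuda
#include <cuda_runtime.h>

// One thread per pixel, launched as <<<rows, cols>>> like the call above.
// Assumes interleaved 8-bit BGR input and cols no larger than the device's
// maximum number of threads per block.
__global__ void bgrToGray(const unsigned char* d_in, unsigned char* d_out)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;  // pixel index
    float b = d_in[3 * idx + 0];
    float g = d_in[3 * idx + 1];
    float r = d_in[3 * idx + 2];
    // Weighted sum, with +0.5f so the conversion rounds instead of truncating.
    d_out[idx] = static_cast<unsigned char>(0.114f * b + 0.587f * g + 0.299f * r + 0.5f);
}

// Hypothetical host wrapper; returns true if every CUDA call succeeded.
bool convertToGray(const unsigned char* h_in, unsigned char* h_out, int rows, int cols)
{
    const size_t numPixels = static_cast<size_t>(rows) * cols;
    unsigned char* d_inBuf = 0;
    unsigned char* d_out = 0;
    cudaError_t err = cudaMalloc(&d_inBuf, 3 * numPixels);
    if (err == cudaSuccess) err = cudaMalloc(&d_out, numPixels);
    if (err == cudaSuccess)
        err = cudaMemcpy(d_inBuf, h_in, 3 * numPixels, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        // Const view of the input: bgrToGray(d_out, d_in) would not compile.
        const unsigned char* d_in = d_inBuf;
        bgrToGray<<<rows, cols>>>(d_in, d_out);
        err = cudaGetLastError();  // reports a failed launch
    }
    if (err == cudaSuccess)
        err = cudaMemcpy(h_out, d_out, numPixels, cudaMemcpyDeviceToHost);
    cudaFree(d_inBuf);
    cudaFree(d_out);
    return err == cudaSuccess;
}
```

Checking the return value of each call also helps here, since a failed copy or launch otherwise leaves the output buffer holding whatever was in it before.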

Having done quite a bit of CUDA programming in the past, I would strongly recommend that you use Thrust instead of hand-crafting kernels. Even thrust::for_each is hard to beat with raw kernels.
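For comparison, here is a minimal sketch of what the grayscale conversion might look like in Thrust. It uses `thrust::transform` rather than `thrust::for_each`, since there is one output per input. The functor `BgrToGray`, the wrapper `convertToGrayThrust`, the interleaved 8-bit BGR layout, and the luma weights are assumptions for illustration.

```cuda
#include <thrust/device_vector.h>
#include <thrust/transform.h>
#include <thrust/copy.h>

// Hypothetical functor: maps one BGR pixel to one gray value.
struct BgrToGray
{
    __host__ __device__
    unsigned char operator()(const uchar3& p) const
    {
        // p.x = blue, p.y = green, p.z = red (assumed OpenCV channel order).
        return static_cast<unsigned char>(
            0.114f * p.x + 0.587f * p.y + 0.299f * p.z + 0.5f);
    }
};

// Hypothetical wrapper: h_in holds 3 * numPixels bytes, h_out holds numPixels bytes.
void convertToGrayThrust(const unsigned char* h_in, unsigned char* h_out, int numPixels)
{
    // View the packed bytes as 3-byte pixels (uchar3 has no padding).
    const uchar3* pixels = reinterpret_cast<const uchar3*>(h_in);

    thrust::device_vector<uchar3> d_in(pixels, pixels + numPixels);  // host-to-device copy
    thrust::device_vector<unsigned char> d_out(numPixels);

    thrust::transform(d_in.begin(), d_in.end(), d_out.begin(), BgrToGray());

    thrust::copy(d_out.begin(), d_out.end(), h_out);  // device-to-host copy
}
```

There is no launch configuration and no raw device pointer to pass in the wrong order. Thrust picks the grid size, and the vectors free their memory when they go out of scope.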

Upvotes: 2
