LaserJesus

Reputation: 8550

Why is the OpenCL Kernel not using normal x y coordinates with Image2D?

TL;DR:

For anyone arriving here while trying to figure out how to do Gaussian blur or grayscale conversion with OpenCL, the final working code is here. Note that in that repo I'm actually running the whole thing inside Docker with GPU access using Nvidia's Docker wrapper. You can look inside the 'Dockerfile' for the steps needed to get the code running, or just run it with Nvidia-Docker if you have that set up and are running on an Nvidia GPU.

Original Question:

Using the following kernel in an OpenCL image filter application I get the expected result, that is, a returned grayscale version of the input image:

const sampler_t sampler =   CLK_NORMALIZED_COORDS_FALSE |
                            CLK_ADDRESS_CLAMP_TO_EDGE |
                            CLK_FILTER_NEAREST;

__kernel void process(__read_only  image2d_t src,
                        __write_only image2d_t dst)
{
    int x = get_global_id(0);
    int y = get_global_id(1);

    float4 color;

    color = read_imagef(src, sampler, (int2)(x, y));
    float gray = (color.x + color.y + color.z) / 3;
    write_imagef(dst, (int2)(x,y), (float4)(gray, gray, gray, 0));
}

So far, so good. I then tried to create a kernel that would just copy across the top and left border of the image:

const sampler_t sampler =   CLK_NORMALIZED_COORDS_FALSE |
                            CLK_ADDRESS_CLAMP_TO_EDGE |
                            CLK_FILTER_NEAREST;

__kernel void process(__read_only  image2d_t src,
                        __write_only image2d_t dst)
{
    int x = get_global_id(0);
    int y = get_global_id(1);

    float4 color;

    if (x < 10 || y < 10) 
    {
        color = read_imagef(src, sampler, (int2)(x, y));
        write_imagef(dst, (int2)(x,y), (float4)(color.x, color.y, color.z, 0));
    } 
    else 
    {
        write_imagef(dst, (int2)(x,y), (float4)(0,0,0,0));
    }
}

The returned image is not what I expected: Image that appears incorrectly processed

I'm loading the input image this way:

//  Load an image using the OpenCV library and create an OpenCL
//  image out of it
cl::Image2D LoadImage(cl::Context context, char *fileName, int &width, int &height)
{
    cv::Mat image = cv::imread(fileName, CV_LOAD_IMAGE_COLOR);
    cv::Mat imageRGBA;

    width = image.rows;
    height = image.cols;

    cv::cvtColor(image, imageRGBA, CV_RGB2RGBA);

    char *buffer = reinterpret_cast<char *>(imageRGBA.data);

    cl::Image2D clImage(context,
                            CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                            cl::ImageFormat(CL_RGBA, CL_UNORM_INT8),
                            width,
                            height,
                            0,
                            buffer);
    return clImage;
}

The output image:

cl::Image2D imageOutput(context,
            CL_MEM_WRITE_ONLY,
            cl::ImageFormat(CL_RGBA, CL_UNORM_INT8),
            width,
            height,
            0,
            NULL);

The kernel launch:

cl::Program program(context, util::loadProgram("border.cl"), true); 
cl::make_kernel<cl::Image2D, cl::Image2D> filter(program, "process");
cl::NDRange global(width, height);
filter(cl::EnqueueArgs(queue, global), clImageInput, imageOutput);
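
(For what it's worth, filter here is the functor returned by cl::make_kernel from the OpenCL 1.2 C++ bindings, and passing the two-dimensional cl::NDRange should give a 2D launch. If I understand the bindings correctly, the call above is roughly equivalent to this explicit version; just an untested sketch using the same program, queue, and images:)

cl::Kernel kernel(program, "process");
kernel.setArg(0, clImageInput);   // __read_only  image2d_t src
kernel.setArg(1, imageOutput);    // __write_only image2d_t dst
// One work-item per pixel, no offset, let the runtime pick the local size
queue.enqueueNDRangeKernel(kernel, cl::NullRange, global, cl::NullRange);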

Then reading the image back:

cl::size_t<3> origin;
origin[0] = 0; origin[1] = 0; origin[2] = 0;
cl::size_t<3> region;
region[0] = width; region[1] = height; region[2] = 1;
float* oup = new float[width * height];

queue.enqueueReadImage(imageOutput, CL_TRUE, origin, region, 0, 0, oup);

cv::imwrite(filename_out, cv::Mat(width, height, CV_8UC4, oup)); 

Why is the image being processed the way it is? Only selecting pixels with a y coordinate less than 10 seems to work, but selecting pixels with an x coordinate less than 10 seems to stagger across the image.

If I write a test image using the following line in the kernel:

write_imagef(dst, (int2)(x,y), (float4)((float)x / 512.0f, 0, 0, 0));

I get the following image:

Red channel test gradient

The first strange thing is that the blue channel is being set, not the red. I have no idea why, as I am always loading and saving the image in RGBA order. Secondly, the banding is very unusual; I'm not sure how to interpret it.

If I use the following line in the kernel:

write_imagef(dst, (int2)(x,y), (float4)(0, (float)y / 512.0f, 0, 0));

I get the following image:

Green channel test gradient

This looks the way I would expect.

I can provide more code if necessary, but the grayscale kernel works perfectly in the exact same harness, as does another kernel (not listed here) that simply copies all the pixels across.

I'm running the code on an Nvidia GeForce 980M with OpenCL 1.2.

Upvotes: 2

Views: 986

Answers (2)

LaserJesus

Reputation: 8550

OK, so the issue was that I was reading height and width backwards, i.e.

width = image.rows;
height = image.cols;

Should have been

height = image.rows;
width = image.cols;

With this corrected, the rest of the code can stay the same, except the last line where I save the image to disk; there the values need to be swapped as well, i.e.

cv::imwrite(filename_out, cv::Mat(width, height, CV_8UC4, oup)); 

Needs to change to:

cv::imwrite(filename_out, cv::Mat(height, width, CV_8UC4, oup)); 

I think this ultimately comes down to the matrix view of an image, where the first coordinate is the row number (the height) and the second coordinate is the column number (the width).
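
Putting both changes together, the mapping looks roughly like this (just a sketch of the relevant lines; the rest of LoadImage and the readback stay as in the question):

// cv::Mat is indexed (row, col), i.e. (y, x)
height = image.rows;   // rows = number of scanlines   = image height
width  = image.cols;   // cols = pixels per scanline   = image width

// ... create the CL images and launch the kernel with (width, height) as before ...

// The cv::Mat constructor also takes (rows, cols), so it's (height, width) here too
cv::imwrite(filename_out, cv::Mat(height, width, CV_8UC4, oup));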

The diagnostics @Dithermaster mentioned really helped, as did printing out the assumed width and height, which were ultimately incorrect.

It's interesting that with both of those errors in the code a pixel-for-pixel copy worked fine, but once you start performing actions based on the x, y coordinates you get some really funky results.

Upvotes: 1

Dithermaster

Reputation: 6343

I'm not seeing anything obvious yet. One strange thing: your image is CL_RGBA, CL_UNORM_INT8, but you're reading it out into an array of floats? How are you displaying it from that? Second, I'm not familiar with your kernel launch technique; what is filter, and is it launching with a dimension of 2? Regarding the issue you're seeing, I'd suggest using a process of elimination to figure out where the problem lies. For example: (1) if you remove the conditional and copy all pixels, do you get the whole image? (2) Instead of writing black where the conditional is false, what if you write a red-channel gradient based on X position and a green-channel gradient based on Y position? Do you get a double gradient? Based on the results, keep dividing the problem until you find the cause. It looks a lot like a row pitch issue, perhaps in the display function?
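
As a quick way to test the pitch theory, you could also read the result back into a plain byte buffer with an explicit host row pitch instead of passing 0; just a sketch, reusing the queue, origin, and region from your question:

// CL_RGBA + CL_UNORM_INT8 means 4 bytes per pixel
const size_t bytesPerPixel = 4;
const size_t hostRowPitch = width * bytesPerPixel;   // tightly packed rows on the host
unsigned char *pixels = new unsigned char[hostRowPitch * height];

// Making the pitch explicit lets you compare it directly against whatever
// row length the display/save code assumes for the buffer.
queue.enqueueReadImage(imageOutput, CL_TRUE, origin, region, hostRowPitch, 0, pixels);

If the image looks right when interpreted with exactly that pitch, the kernel output is fine and the problem is in how the host is interpreting the rows.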

Upvotes: 2
