scatalfamo

Reputation: 35

Memory object allocation failure using C# and OpenCL

I am writing an image processing program whose express purpose is to alter large images; the one I'm working with is 8165 pixels by 4915 pixels. I was told to implement GPU processing, so after some research I decided to go with OpenCL. I started using the OpenCL C# wrapper OpenCLTemplate.

My code takes in a bitmap and uses LockBits to lock its memory location. I then copy each byte, in order, into an array and run the array through the OpenCL kernel, which inverts each color byte in the array. I then write the inverted bytes back into the memory location of the image. I split this process into ten chunks so that I can increment a progress bar.

My code works perfectly with smaller images, but when I run it on my big image I keep getting a MemObjectAllocationFailure when trying to execute the kernel. I don't know why it's doing this, and I would appreciate any help in figuring out the cause or how to fix it.

    using OpenCLTemplate;

    public static void Invert(Bitmap image, ToolStripProgressBar progressBar)
    {
        string openCLInvert = @"
        __kernel void Filter(__global uchar *  Img0,
                             __global float *  ImgF)

        {
            // Gets information about work-item
            int x = get_global_id(0);
            int y = get_global_id(1);

            // Gets information about work size
            int width = get_global_size(0);
            int height = get_global_size(1);

            int ind = 4 * (x + width * y );

            // Inverts image colors
            ImgF[ind]= 255.0f - (float)Img0[ind];
            ImgF[1 + ind]= 255.0f - (float)Img0[1 + ind];
            ImgF[2 + ind]= 255.0f - (float)Img0[2 + ind];

            // Leave alpha component equal
            ImgF[ind + 3] = (float)Img0[ind + 3];
        }";

        //Lock the image in memory and get image lock data
        var imageData = image.LockBits(new Rectangle(0, 0, image.Width, image.Height), ImageLockMode.ReadWrite, PixelFormat.Format32bppArgb);

        CLCalc.InitCL();

        for (int i = 0; i < 10; i++)
        {
            unsafe
            {
                int adjustedHeight = (((i + 1) * imageData.Height) / 10) - ((i * imageData.Height) / 10);
                int count = 0;

                byte[] Data = new byte[(4 * imageData.Stride * adjustedHeight)];
                var startPointer = (byte*)imageData.Scan0;

                for (int y = ((i * imageData.Height) / 10); y < (((i + 1) * imageData.Height) / 10); y++)
                {
                    for (int x = 0; x < imageData.Width; x++)
                    {
                        byte* Byte = (byte*)(startPointer + (y * imageData.Stride) + (x * 4));

                        Data[count] = *Byte;
                        Data[count + 1] = *(Byte + 1);
                        Data[count + 2] = *(Byte + 2);
                        Data[count + 3] = *(Byte + 3);
                        count += 4;
                    }
                }

                CLCalc.Program.Compile(openCLInvert);
                CLCalc.Program.Kernel kernel = new CLCalc.Program.Kernel("Filter");
                CLCalc.Program.Variable CLData = new CLCalc.Program.Variable(Data);

                float[] imgProcessed = new float[Data.Length];

                CLCalc.Program.Variable CLFiltered = new CLCalc.Program.Variable(imgProcessed);
                CLCalc.Program.Variable[] args = new CLCalc.Program.Variable[] { CLData, CLFiltered };

                kernel.Execute(args, new int[] { imageData.Width, adjustedHeight });
                CLCalc.Program.Sync();

                CLFiltered.ReadFromDeviceTo(imgProcessed);

                count = 0;

                for (int y = ((i * imageData.Height) / 10); y < (((i + 1) * imageData.Height) / 10); y++)
                {
                    for (int x = 0; x < imageData.Width; x++)
                    {
                        byte* Byte = (byte*)(startPointer + (y * imageData.Stride) + (x * 4));

                        *Byte = (byte)imgProcessed[count];
                        *(Byte + 1) = (byte)imgProcessed[count + 1];
                        *(Byte + 2) = (byte)imgProcessed[count + 2];
                        *(Byte + 3) = (byte)imgProcessed[count + 3];
                        count += 4;
                    }
                }
            }
            progressBar.Owner.Invoke((Action)progressBar.PerformStep);
        }

        //Unlock image
        image.UnlockBits(imageData);
    }

Upvotes: 3

Views: 3158

Answers (1)

Eric Bainville

Reputation: 9906

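You may have reached a memory allocation limit of your OpenCL driver/device. Check the values returned by clGetDeviceInfo, in particular CL_DEVICE_MAX_MEM_ALLOC_SIZE (the limit for the size of one single memory object) and CL_DEVICE_GLOBAL_MEM_SIZE (the total global memory on the device). The OpenCL driver may allow the total size of all allocated memory objects to exceed the memory size on your device, and will copy them to/from host memory when needed, but each individual object must still fit within the single-object limit.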
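To see how close the question's buffers might come to such a limit, here is a rough C sketch of the size arithmetic. It assumes the stride of a Format32bppArgb bitmap is 4 * width bytes (32660 for a width of 8165) and that the largest of the ten chunks has 492 rows; the function names are made up for illustration. Note that the question sizes its byte array as 4 * Stride * adjustedHeight, although Stride already counts four bytes per pixel, so the array appears to be four times larger than the kernel ever indexes.

```c
#include <stdint.h>

/* Bytes in the input buffer as the question allocates it:
   new byte[4 * Stride * adjustedHeight]. */
uint64_t question_input_bytes(uint64_t stride, uint64_t rows) {
    return 4 * stride * rows;
}

/* Bytes in the float output buffer: the question allocates one
   4-byte float per element of the input array. */
uint64_t question_output_bytes(uint64_t stride, uint64_t rows) {
    return question_input_bytes(stride, rows) * 4;
}

/* Bytes the kernel actually reads: 4 channels per pixel, 1 byte each. */
uint64_t needed_input_bytes(uint64_t width, uint64_t rows) {
    return 4 * width * rows;
}
```

With stride 32660 and 492 rows, these give 64,274,880 bytes for the input array and 257,099,520 bytes (roughly 245 MiB) for the float output, against 16,068,720 bytes actually needed for the input. If CL_DEVICE_MAX_MEM_ALLOC_SIZE on your device is below the size of the float buffer, that single allocation would be enough to fail.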

To process large images, you may have to split them into smaller pieces, and process them separately.
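One way to pick the piece size is to derive it from the device limit rather than using a fixed number of chunks. The following is a minimal sketch, not a definitive implementation: the helper names are hypothetical, and max_alloc_bytes stands for whatever clGetDeviceInfo reports for CL_DEVICE_MAX_MEM_ALLOC_SIZE on your device.

```c
#include <stdint.h>

/* Largest number of image rows whose biggest buffer still fits in one
   memory object. bytes_per_row is the size of one row in the largest
   buffer (for the question's kernel, the float output).
   Returns 0 if not even a single row fits. */
uint64_t rows_per_chunk(uint64_t max_alloc_bytes, uint64_t bytes_per_row) {
    if (bytes_per_row == 0) {
        return 0;
    }
    return max_alloc_bytes / bytes_per_row;
}

/* Number of chunks needed to cover image_rows, rounding up. */
uint64_t chunk_count(uint64_t image_rows, uint64_t rows) {
    if (rows == 0) {
        return 0;
    }
    return (image_rows + rows - 1) / rows;
}
```

As an illustration only, assume a limit of 128 MiB (134,217,728 bytes) and a float output row of 8165 pixels * 4 channels * 4 bytes = 130,640 bytes. Then rows_per_chunk gives 1027 rows, and chunk_count for the 4915-row image gives 5 chunks. In practice you would probably leave some headroom below the reported limit, and writing the output as uchar instead of float would cut the per-row size by a factor of four.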

Upvotes: 1
