3DExtended

Reputation: 151

OpenCL - Writing to the Buffer is zero?

I have written a kernel that should do nothing except add one to each component of a float3:

__kernel void GetCellIndex(__global Particle* particles) {

   int globalID = get_global_id(0);
   particles[globalID].position.x += 1;
   particles[globalID].position.y += 1;
   particles[globalID].position.z += 1;
};

with the following struct (in the kernel):

typedef struct _Particle
{
    cl_float3 position;
}Particle;

My problem is that when I write my array of particles to the GPU, every component is zero. Here is the necessary code:

(Particle*) particles = new Particle[200];
for (int i = 0; i < 200; i++)
{
    particles[i].position.x = 5f;
}

cl_Particles = clCreateBuffer(context, CL_MEM_READ_WRITE, sizeof(Particle)*200, NULL, &err);
if (err != 0)
{
    std::cout << "CreateBuffer does not work!" << std::endl;
    system("Pause");
}

clEnqueueWriteBuffer(queue, cl_Particles, CL_TRUE, 0, sizeof(Particle) * 200, &particles, 0, NULL, NULL);


//init of kernel etc.



err = clSetKernelArg(kernel, 0, sizeof(Particle) * 200, &cl_Particles);
if (err != 0) {
    std::cout << "Error: setKernelArg 0 does not work!" << std::endl;
    system("Pause");
}

And this is my struct on the CPU:

typedef struct _Particle
{
    cl_float4 position;
}Particle;

Can someone help me with this problem? (Any clue is worth discussing.)

Thanks

Upvotes: 1

Views: 1055

Answers (1)

Martin Zabel

Reputation: 3659

Your code snippet contains some typical C programming errors. First,

(Particle*) particles = new Particle[200];

does not declare a new variable particles as a pointer to Particle. It must be:

Particle *particles = new Particle[200];

Next, in your call to

clEnqueueWriteBuffer(queue, cl_Particles, CL_TRUE, 0, sizeof(Particle) * 200, &particles, 0, NULL, NULL);

you passed a pointer to the particles pointer as the 6th parameter (ptr). Here, however, you must pass a pointer to the host memory region that contains the data. Thus, change &particles to particles:

clEnqueueWriteBuffer(queue, cl_Particles, CL_TRUE, 0, sizeof(Particle) * 200, particles, 0, NULL, NULL);
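To see why the extra & matters, here is a minimal host-only sketch that needs no OpenCL at all. The Particle struct and the helper firstFloatAt are hypothetical stand-ins, not part of the original code. The helper reads the first bytes at a given address, roughly what a byte-wise copy from the host pointer would start with:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the host-side struct (four floats, like cl_float4).
struct Particle {
    float position[4];
};

// Hypothetical helper: interprets the first bytes at `src` as a float,
// which is what a byte-wise copy starting at `src` would pick up first.
float firstFloatAt(const void* src) {
    float value;
    std::memcpy(&value, src, sizeof(value));
    return value;
}

// Returns true when the array and the pointer variable live at different addresses.
bool arrayAndPointerVariableDiffer() {
    Particle* particles = new Particle[200]();   // zero-initialized array
    particles[0].position[0] = 5.0f;

    const void* data = particles;                // start of the array data
    const void* pointerVariable = &particles;    // address of the pointer itself

    assert(firstFloatAt(data) == 5.0f);          // the array really holds 5.0f
    bool differ = (data != pointerVariable);

    delete[] particles;
    return differ;
}
```

Copying from particles starts at the array and finds 5.0f. Copying from &particles starts at the pointer variable, whose bytes encode an address rather than particle data, and a copy of sizeof(Particle) * 200 bytes from there would also run far past the end of that variable.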

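The setup of the kernel argument is also wrong. Here you must pass the OpenCL buffer object created with clCreateBuffer, so the argument size is the size of that cl_mem handle, not the size of the buffer's contents. Thus, replace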

err = clSetKernelArg(kernel, 0, sizeof(Particle) * 200, &cl_Particles);

with:

err = clSetKernelArg(kernel, 0, sizeof(cl_Particles), &cl_Particles);

As clCreateBuffer returns a value of type cl_mem, the expression sizeof(cl_Particles) evaluates to the same as sizeof(cl_mem). I recommend always calling sizeof on the variable, so that you need to change the data type in only one place: the variable declaration.
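The difference between the two sizes can be sketched without OpenCL headers. FakeMem below is a hypothetical stand-in for cl_mem (which is likewise a pointer to an opaque struct), and both helper functions are made up for illustration:

```cpp
#include <cstddef>

// Hypothetical opaque handle standing in for cl_mem.
typedef struct FakeMemObject* FakeMem;

// Hypothetical host-side struct (four floats, like cl_float4).
struct Particle {
    float position[4];
};

// Size to hand to a clSetKernelArg-style call for a buffer argument:
// the size of the handle variable itself (sizeof on a reference yields
// the size of the referenced type).
std::size_t handleArgSize(const FakeMem& handle) {
    return sizeof(handle);
}

// Size of the data stored in the buffer; this belongs in the buffer
// creation and write calls, not in the kernel-argument call.
std::size_t bufferDataSize(std::size_t count) {
    return sizeof(Particle) * count;
}
```

For 200 particles the data size is 3200 bytes, while the handle is only pointer-sized, so passing the data size as the argument size describes the wrong object.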

On my platform, cl_float3 is the same as cl_float4. This might not be true on every platform, so you should always use matching types in the host code and in the kernel code. Also, in your kernel code you must use the type float4 instead of cl_float4, because the cl_-prefixed types are defined only in the host headers.
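Why matching types matter can be illustrated with a host-only sketch. The structs below are hypothetical stand-ins, not the real OpenCL types: one packs three floats tightly, the other is padded and aligned to 16 bytes. The sketch assumes a 4-byte float:

```cpp
#include <cstddef>

// Hypothetical tightly packed three-component vector (12 bytes).
struct Float3Packed {
    float s[3];
};

// Hypothetical four-component vector with 16-byte alignment (16 bytes).
struct alignas(16) Float4Aligned {
    float s[4];
};

struct ParticlePacked  { Float3Packed  position; };
struct ParticleAligned { Float4Aligned position; };

static_assert(sizeof(ParticlePacked) == 12, "three packed floats");
static_assert(sizeof(ParticleAligned) == 16, "four aligned floats");

// Byte offset at which element `index` starts in an array of elements
// that are `elementSize` bytes each.
std::size_t byteOffsetOfElement(std::size_t elementSize, std::size_t index) {
    return elementSize * index;
}
```

If one side of a transfer assumed the 12-byte layout and the other the 16-byte layout, element 1 would start at byte 12 on one side and byte 16 on the other, and every later element would drift further apart.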

I hope I got the C calls right, because I actually tested this with the C++ code below. The snippet contains the fixed C calls as comments:

Particle *particles = new Particle[200];
for (int i = 0; i < 200; i++)
{
    //particles[i].position.x = 5f;
    particles[i].position.s[0] = 5.0f; // s[0] due to VC++ compiler
}

//cl_mem cl_Particles = clCreateBuffer(context, CL_MEM_READ_WRITE, sizeof(Particle)*200, NULL, &err); // FIXED
cl::Buffer cl_Particles(context, CL_MEM_READ_WRITE, sizeof(Particle)*200, NULL, &err);
checkErr(err, "Buffer::Buffer()");

//err = clEnqueueWriteBuffer(queue, cl_Particles, CL_TRUE, 0, sizeof(Particle) * 200, particles, 0, NULL, NULL); // FIXED
err = queue.enqueueWriteBuffer(cl_Particles, CL_TRUE, 0, sizeof(Particle) * 200, particles, NULL, NULL);
checkErr(err, "CommandQueue::enqueueWriteBuffer()");

//init of kernel
cl::Kernel kernel(program, "GetCellIndex", &err);
checkErr(err, "Kernel::Kernel()");

//err = clSetKernelArg(kernel, 0, sizeof(cl_Particles), &cl_Particles); // FIXED
err = kernel.setArg(0, sizeof(cl_Particles), &cl_Particles);
checkErr(err, "Kernel::setArg()");
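The snippet above calls checkErr without showing it. The following is only a guess at what such a helper might look like, not the one actually used. It assumes the error code is a plain int where 0 means success (CL_SUCCESS is 0), and it throws instead of exiting so the caller decides how to react:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical error-check helper: does nothing on success (0) and
// throws with the name of the failing call otherwise.
inline void checkErr(int err, const char* name) {
    if (err != 0) {
        throw std::runtime_error(std::string(name) +
                                 " failed with error code " +
                                 std::to_string(err));
    }
}
```

Checking the return value of every call this way would have made the invalid argument size passed to clSetKernelArg in the question show up as an error, rather than only as wrong results later.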

Upvotes: 1
