user350617

Weird values when passing an array of structs as an OpenCL kernel argument

When I pass an array of structs to my kernel as an argument, I get weird values for every item after the first (array[1], array[2], etc.). Could this be an alignment issue?

Here is the struct:

typedef struct Sphere
{
    float3 color;
    float3 position;
    float3 reflectivity;
    float radius;
    int phong;
    bool isReflective;
} Sphere;

Here is the host side init code:

cl::Buffer cl_spheres = cl::Buffer(context, CL_MEM_READ_ONLY, sizeof(Sphere) * MAX_SPHERES, NULL, &err);
err = queue.enqueueWriteBuffer(cl_spheres, CL_TRUE, 0, sizeof(Sphere) * MAX_SPHERES, spheres, NULL, &event);
err = kernel.setArg(3, cl_spheres);

What happens is that the color of the second Sphere in the array contains the last component of the color I set on the host side (s2, i.e. z), an uninitialized zero value, and the first component of the position I set on the host side (s0, i.e. x). I noticed that the float3 data type actually has a fourth component (s3) that does not get initialized, and I think that is where the zero comes from. So it looks like an alignment issue, but I am at a loss as to how to fix it. I have made sure that my struct definitions are exactly the same on both sides. Can anyone shed some light on this problem?

Upvotes: 1

Views: 1336

Answers (1)

Eric Bainville

Reputation: 9886

From the OpenCL 1.2 specs, section 6.11.1:

Note that the alignment of any given struct or union type is required by the ISO C standard to be at least a perfect multiple of the lowest common multiple of the alignments of all of the members of the struct or union in question and must also be a power of two.

Also, cl_float3 is treated as a cl_float4 for size and alignment (16 bytes, not 12); see section 6.1.5.
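To see why the size of the 3-component vector matters, here is a minimal host-side sketch that compares two layouts of the question's struct. It does not include the OpenCL headers: PackedFloat3 and PaddedFloat3 are hypothetical stand-ins (the question does not show which host type was used), with PaddedFloat3 mimicking the 16-byte size and alignment of cl_float3.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical host type: three tightly packed floats (12 bytes, 4-byte aligned).
struct PackedFloat3 { float x, y, z; };

// Stand-in for cl_float3: storage for four floats, 16-byte aligned.
struct alignas(16) PaddedFloat3 { float s[4]; };

// The struct from the question, parameterised on the 3-component vector type.
template <typename Vec3>
struct SphereLayout {
    Vec3 color;
    Vec3 position;
    Vec3 reflectivity;
    float radius;
    int phong;
    bool isReflective;
};

using PackedSphere = SphereLayout<PackedFloat3>;
using PaddedSphere = SphereLayout<PaddedFloat3>;

// Print the array stride and the offset of `position` for both layouts.
void printLayouts() {
    std::printf("packed: sizeof=%zu, offsetof(position)=%zu\n",
                sizeof(PackedSphere), offsetof(PackedSphere, position));
    std::printf("padded: sizeof=%zu, offsetof(position)=%zu\n",
                sizeof(PaddedSphere), offsetof(PaddedSphere, position));
}
```

If the host fills the buffer with the packed layout while the kernel reads the padded one, the first element can look partly right, but every later element starts at the wrong byte offset because the array strides differ.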

Finally, in section 6.9.k:

Arguments to kernel functions in a program cannot be declared with the built-in scalar types bool, half, size_t, ptrdiff_t, intptr_t, and uintptr_t or a struct and/or union that contain fields declared to be one of these built-in scalar types.

To comply with these rules, and probably make accesses faster, you can try (OpenCL C side; on the host use cl_float4):

typedef struct Sphere
{
    float4 color;
    float4 position;
    float4 reflectivity;
    float4 radiusPhongReflective; // each value uses 1 float
} Sphere;
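A matching host-side struct could look roughly like the sketch below. Float4 is a hypothetical stand-in for cl_float4 so the example compiles without the OpenCL headers; in real host code you would use cl_float4 directly. The helper makeSphere and the component order in radiusPhongReflective (radius, phong, reflective flag) are assumptions for illustration; any order works as long as the kernel reads the same one.

```cpp
#include <cstddef>

// Stand-in for cl_float4: four floats, 16-byte aligned.
struct alignas(16) Float4 { float s[4]; };

// Host mirror of the kernel-side struct: four 16-byte members, no bool.
struct HostSphere {
    Float4 color;
    Float4 position;
    Float4 reflectivity;
    Float4 radiusPhongReflective;  // assumed order: radius, phong, flag, unused
};

// Fail at compile time if the layout drifts from the expected 64 bytes.
static_assert(sizeof(HostSphere) == 64, "unexpected HostSphere size");
static_assert(offsetof(HostSphere, radiusPhongReflective) == 48,
              "unexpected member offset");

// Hypothetical helper that packs the original fields into the new layout.
HostSphere makeSphere(const float color[3], const float position[3],
                      const float reflectivity[3], float radius, int phong,
                      bool isReflective) {
    HostSphere s{};  // value-initialise so the padding components are zero
    for (int i = 0; i < 3; ++i) {
        s.color.s[i] = color[i];
        s.position.s[i] = position[i];
        s.reflectivity.s[i] = reflectivity[i];
    }
    s.radiusPhongReflective.s[0] = radius;
    s.radiusPhongReflective.s[1] = static_cast<float>(phong);
    s.radiusPhongReflective.s[2] = isReflective ? 1.0f : 0.0f;
    return s;
}
```

With this packing, the kernel would convert the values back itself, for example by casting the second component to int and comparing the third against zero.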

Upvotes: 1
