user11733000

Irregular behaviour of vectors in OpenCL(1.2) kernels

I am trying to perform an operation inside an OpenCL kernel. I have a buffer named filter that holds a 3x3 matrix, with every element initialized to 1.

I pass this buffer as an argument to the OpenCL kernel from the host side. The issue appears when I try to read it on the device side as float3 vectors. For example:

// The kernel name read_filter is a placeholder; the original snippet omitted it.
__kernel void read_filter(constant float3* restrict filter)
{
        float3 temp1 = filter[0];
        float3 temp2 = filter[1];
        float3 temp3 = filter[2];
}

The first two temp variables behave as expected: all of their components are 1. The third one (temp3), however, has only its x component set to 1, while its y and z components are 0. When I read the buffer as plain float values, everything behaves as expected. Am I doing something wrong? I would rather not use the vload functions, since they add overhead.

Upvotes: 1

Views: 77

Answers (1)

Jan-Gerd

Reputation: 1289

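In OpenCL, float3 has the same size and alignment as float4 (16 bytes), so when you index the buffer as float3*, each element spans four of your floats rather than three. temp1 and temp2 together consume eight of your nine values (the fourth value of each group lands in the padding slot), which leaves just one value for temp3.x. The y and z components of temp3 are read from beyond the end of your 9-float buffer. To read tightly packed triples, you will probably need to use the vload3 function.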

See section 6.1.5, "Alignment of Types", of the OpenCL specification for more information:

For 3-component vector data types, the size of the data type is 4 * sizeof(component). This means that a 3-component vector data type will be aligned to a 4 * sizeof(component) boundary. The vload3 and vstore3 built-in functions can be used to read and write, respectively, 3-component vector data types from an array of packed scalar data type.
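
To make the indexing arithmetic concrete, here is a minimal host-side sketch in plain C rather than OpenCL C. It only emulates the two addressing schemes and is not the runtime's actual implementation. The names emu_float3, load_as_float3_array and load_packed3 are hypothetical and exist purely for illustration. In the kernel itself, the packed read would be vload3(i, filter) with filter declared as a constant float pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for OpenCL's float3: three usable components
   plus one padding slot, so each element occupies 4 * sizeof(float). */
typedef struct {
    float x, y, z;
    float pad; /* occupies the slot a float4's w component would use */
} emu_float3;

static_assert(sizeof(emu_float3) == 4 * sizeof(float),
              "emulated float3 should be as large as four floats");

/* Emulates indexing packed floats through a float3 pointer:
   element i starts at float index 4 * i. */
static emu_float3 load_as_float3_array(const float *p, size_t i)
{
    emu_float3 v = { p[4 * i], p[4 * i + 1], p[4 * i + 2], 0.0f };
    return v;
}

/* Emulates vload3(i, p) on packed floats:
   element i starts at float index 3 * i. */
static emu_float3 load_packed3(const float *p, size_t i)
{
    emu_float3 v = { p[3 * i], p[3 * i + 1], p[3 * i + 2], 0.0f };
    return v;
}
```

As a usage example, take a 12-float array whose first nine entries are 1 and whose last three are 0. The three trailing zeros are an assumption added only so the emulation never reads out of bounds. With that input, load_as_float3_array(buf, 2) yields (1, 0, 0), matching what the question reports for temp3, while load_packed3(buf, 2) yields (1, 1, 1). On a real device the float3 read of element 2 goes past the end of the 9-float buffer, so the zeros observed there are not guaranteed.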

Upvotes: 2
