Efficiently multiply OpenCL vector components?

Question

I have a float8 vector whose components I multiply together using vector component addressing, as follows (note that in reality the variable v below isn't a constant):

float8 v = (float8) (1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f, 7.0f, 8.0f);
float result = v.s0 * v.s1 * v.s2 * v.s3 * v.s4 * v.s5 * v.s6 * v.s7;

However, this prevents my kernel from being vectorised when it is compiled with Intel Code Builder:

Device build started
Device build done
Kernel  was not vectorized

To overcome this, I started creating copies of the vector, masking the required components and multiplying the copies together before calling the dot function; however, this all seemed rather inefficient and convoluted.

My question is therefore: how can I multiply the components of my vector in an efficient, vectorised manner?

Answers (1)
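
One approach that may be worth trying is a pairwise (tree) reduction: multiply the low half of the vector by the high half, then repeat on the result, so that the eight components are combined in three vector-wide steps instead of seven scalar ones. The sketch below is a plain-C model of that idea, written so it can be compiled and checked on the host; the function name product8 is hypothetical, and the comments show the OpenCL C swizzle expression each loop stands in for. Whether Intel's compiler then vectorises the kernel is an assumption that would need to be verified against the build log.

```c
#include <stddef.h>

/* Plain-C model of a pairwise (tree) product over eight floats.
   product8 is a hypothetical helper name, not part of OpenCL.
   Each step notes the OpenCL C expression it stands in for. */
float product8(const float v[8])
{
    float p4[4];
    float p2[2];

    /* OpenCL C: float4 p4 = v.lo * v.hi;
       -> (s0*s4, s1*s5, s2*s6, s3*s7) */
    for (size_t i = 0; i < 4; ++i)
        p4[i] = v[i] * v[i + 4];

    /* OpenCL C: float2 p2 = p4.lo * p4.hi;
       -> (s0*s4*s2*s6, s1*s5*s3*s7) */
    for (size_t i = 0; i < 2; ++i)
        p2[i] = p4[i] * p4[i + 2];

    /* OpenCL C: float result = p2.x * p2.y; */
    return p2[0] * p2[1];
}
```

Note that this changes the order in which the components are multiplied, so for general inputs the result can differ from the left-to-right product in the last bits due to floating-point rounding.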
