Reputation: 907
I'm trying to make sense of a strange behavior in my OpenCL kernel function. I'm basically trying to convert a string that contains hexadecimal digits into a string that contains its decimal representation but, for reasons I cannot fathom, running the same kernel on the GPU and on the CPU yields different results.
The kernel looks like the following:
// yes, there's no result defined for the moment
__kernel void testKernel(__global uint message_length, __global char *message) {
    size_t converted_message_length = message_length / 2;
    char converted_message[converted_message_length];
    // (1) hex to decimal conversion
    for (size_t idx = 0, j = 0; idx < converted_message_length; idx++, j++) {
        converted_message[idx] = (message[j] & '@' ? message[j] + 9 : message[j]) << 4;
        j++;
        converted_message[idx] |= (message[j] & '@' ? message[j] + 9 : message[j]) & 0xF;
        printf("converted '%c%c' into '%i'\n", message[j - 1], message[j], converted_message[idx]);
    }
    // (2) this should be redundant, since I already print the content...
    // but actually behaves differently with different device (CPU/GPU)
    for (size_t idx = 0, j = 0; idx < converted_message_length; idx++, j++) {
        printf("converted_message[%i]: '%i'\n", idx, converted_message[idx]);
    }
}
Now, if I pass the length 4 and an input string containing the hexadecimal value 3e2b as arguments to testKernel, I would expect the string to be converted into the decimals 62 and 43 (see this table for hex -> decimal conversion).
And if I run the kernel on my CPU (Intel(R) Core(TM) i9-9880H), I can indeed see the following conversion happening:
converted '3e' into '62'
converted '2b' into '43'
converted_message[0]: '62'
converted_message[1]: '43'
However, if I run this very same kernel using my GPU (AMD Radeon Pro 5500M), I see the following results:
converted '3e' into '62'
converted '2b' into '43'
converted_message[0]: '0' <-- why it is 0 ???
converted_message[1]: '0' <-- why it is 0 ???
It seems like converted_message is successfully written inside loop (1), but then its values get lost when I enter loop (2). How is that even possible? Is OpenCL performing some weird optimization under the hood that manifests itself only when running on the GPU?
Upvotes: 1
Views: 271
Reputation: 3698
char converted_message[converted_message_length];
This is a variable-length array (VLA), which is not supported in standard OpenCL C. It might work on some OpenCL implementations as an extension, but it is not portable.
Either specify a fixed, compile-time size for the array, or allocate a buffer on the host and pass it to the kernel as an argument.
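As a rough illustration of both suggestions, here is a minimal sketch, not a drop-in fix. It assumes a hypothetical upper bound MAX_CONVERTED_LENGTH, a hypothetical kernel name testKernelFixed, a hypothetical helper hex_nibble, and an extra output argument converted that the host would have to allocate. The #ifndef shim at the top is also an assumption of this sketch: it only exists so the same source can be compiled as plain C for a quick check on the host, and an OpenCL compiler skips it.

```c
// Host-side shim (assumption of this sketch): lets the same source compile
// as plain C. OpenCL compilers define __OPENCL_VERSION__ and skip this part.
#ifndef __OPENCL_VERSION__
#define __kernel
#define __global
typedef unsigned int uint;
typedef unsigned char uchar;
#endif

// Hypothetical upper bound on the number of decoded bytes.
#define MAX_CONVERTED_LENGTH 64

// Same nibble trick as in the question: letters have bit 0x40 ('@') set,
// so adding 9 maps 'a'..'f' and 'A'..'F' onto the low nibbles 0xA..0xF.
uint hex_nibble(char c) {
    return (uint)((c & '@') ? c + 9 : c) & 0xFu;
}

// message_length is passed by value, message is the hex string, and
// converted is an output buffer allocated by the host (hypothetical argument).
__kernel void testKernelFixed(uint message_length,
                              __global const char *message,
                              __global uchar *converted) {
    // Fixed, compile-time size instead of a variable-length array.
    uchar converted_message[MAX_CONVERTED_LENGTH];

    uint n = message_length / 2;
    if (n > MAX_CONVERTED_LENGTH) {
        n = MAX_CONVERTED_LENGTH;  // clamp so we never overrun the array
    }

    // Decode each pair of hex digits into one byte.
    for (uint i = 0; i < n; i++) {
        uint hi = hex_nibble(message[2 * i]);
        uint lo = hex_nibble(message[2 * i + 1]);
        converted_message[i] = (uchar)((hi << 4) | lo);
    }

    // Copy the result into the host-visible buffer.
    for (uint i = 0; i < n; i++) {
        converted[i] = converted_message[i];
    }
}
```

If the private array is not needed for anything else, the kernel could write straight into converted and skip the temporary array altogether. The fixed bound mainly makes sense when the maximum message size is known up front.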
Upvotes: 1