Reputation: 3801
I am writing a very simple vector addition kernel in OpenACC, and I am wondering whether there is an issue with the compiler I am using (accULL with OpenCL), because data does not seem to be copied back from the device to the host correctly. All the results are correct except result[0]. For example, the following code:
for (i = 0; i < VEC_SIZE; i++) {
    a[i] = i;
    b[i] = VEC_SIZE - i;
    result[i] = 0;
}
#pragma acc kernels copyin(a,b) copy(result)
for (i = 0; i < VEC_SIZE; i++) {
    result[i] = a[i] + b[i];
}
// verify result
for (i = 0; i < VEC_SIZE; i++) {
    if ((a[i] + b[i]) != result[i]) {
        fprintf(stderr, "Incorrect results id %d val: %d \n", i, result[i]);
    }
}
It prints the following:
Incorrect results id 0 val: 0
This means that every result except the one at index 0 is correct; it seems that the result for index 0 is not copied back from the device.
Is this a compiler/runtime bug, or did I miss something in my code?
Upvotes: 0
Views: 79
Reputation: 655
Yes, I also think this is a bug in your compiler, because your code looks right. You could try the PGI compiler, which I am using now; PGI belongs to NVIDIA now. Besides, you can change "copy(result)" to "copyout(result)" to reduce memory transfer time, because the initial value of result is not needed on the device.
Upvotes: 0