I am getting unexpected results from get_global_id(). I want to convolve a 3D voxel volume of dimensions {35,35,35} with a 3D kernel of dimensions {5,5,5}. To do that, I call clEnqueueNDRangeKernel with global size = {35,35,35} and local size = {5,5,5}:
std::vector<size_t> local_nd = { 5, 5, 5 };
std::vector<size_t> global_nd = { 35, 35, 35 };
err = clEnqueueNDRangeKernel( queue, hello_kernel, work_dim, NULL, global_nd.data(), local_nd.data(), 0, NULL, NULL);
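As a sanity check on the launch itself, the sizes above can be validated on the host before enqueueing. The sketch below is a minimal illustration, not part of the OpenCL API: `launch_config_ok` is a hypothetical helper, and the total work-group limit of 1024 used in the usage note is an assumption (the post only gives per-dimension limits). It applies the OpenCL 1.x rule that each global dimension must be a multiple of the corresponding local dimension when an explicit local size is passed.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not an OpenCL call): checks that an explicit local
// size is compatible with the global size and with the device limits.
bool launch_config_ok(const std::vector<std::size_t>& global_nd,
                      const std::vector<std::size_t>& local_nd,
                      const std::vector<std::size_t>& max_item_sizes,
                      std::size_t max_work_group_size) {
    if (global_nd.size() != local_nd.size() ||
        global_nd.size() > max_item_sizes.size()) {
        return false;
    }
    std::size_t group_items = 1;  // work-items in one work-group
    for (std::size_t d = 0; d < global_nd.size(); ++d) {
        // Each local dimension must be non-zero and within the device limit.
        if (local_nd[d] == 0 || local_nd[d] > max_item_sizes[d]) return false;
        // OpenCL 1.x: global size must be evenly divisible by local size.
        if (global_nd[d] % local_nd[d] != 0) return false;
        group_items *= local_nd[d];
    }
    // The whole work-group must also fit the device's total limit.
    return group_items <= max_work_group_size;
}
```

Under those assumptions, `launch_config_ok({35, 35, 35}, {5, 5, 5}, {1024, 1024, 64}, 1024)` returns true: 35 is a multiple of 5 in every dimension and a work-group holds 5 * 5 * 5 = 125 work-items, so the launch configuration itself looks valid.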
When I call get_global_id(), I expect get_global_id(0), get_global_id(1), and get_global_id(2) to each range from 0 to 34. The results for get_global_id(0) and get_global_id(1) look correct. However, the values of get_global_id(2) only range from 30 to 34 instead of the 0 to 34 that I expect.
const int ic0 = get_global_id(0); // icol
const int ic1 = get_global_id(1); // irow
const int ic2 = get_global_id(2); // idep
printf(" %d %d %d\n", ic0, ic1, ic2 );
// value of ic0 = [0 -> 34] correct!
// value of ic1 = [0 -> 34] correct!
// value of ic2 = [30 -> 34] ( SHOULD IT BE [0->34] )?
My GPU reports maximum work-item sizes per dimension of { 1024, 1024, 64 }.
Upvotes: 0
Views: 747
I found the problem, as pmdj suggested:
printf in kernels isn't always reliable - there's often a fixed-size buffer, and if you output too much, some messages may be dropped.
I then changed the OpenCL code to print only under certain conditions, for example:
if( ic2< 10 )
printf("ic2: %d ", ic2 );
With that change, the output covers the range [0 --> 34], as I expected.
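To put the printf volume in perspective, and to show what the full set of IDs should look like, here is a small host-side emulation in plain C++. It is only a sketch and not OpenCL code: `IdRange` and `emulate_global_ids` are hypothetical names, and it assumes no global offset and a global size that is a multiple of the local size. It walks every work-group and local ID, computes the global ID as group_id * local_size + local_id, and records the minimum and maximum per dimension. In a real kernel, a similar reduction would need atomics (for example atomic_min and atomic_max on a global int buffer) instead of one printf per work-item.

```cpp
#include <array>
#include <cstddef>

// Hypothetical result type: per-dimension ID range plus total work-item count.
struct IdRange {
    std::array<std::size_t, 3> min_id;
    std::array<std::size_t, 3> max_id;
    std::size_t work_items;
};

// Emulates the global IDs of a 3D NDRange on the host.
// Assumes zero global offset and global_nd[d] % local_nd[d] == 0.
IdRange emulate_global_ids(const std::array<std::size_t, 3>& global_nd,
                           const std::array<std::size_t, 3>& local_nd) {
    // Start min at an upper bound (no ID reaches global_nd[d]) and max at 0.
    IdRange r{global_nd, {0, 0, 0}, 0};
    const std::array<std::size_t, 3> groups = {global_nd[0] / local_nd[0],
                                               global_nd[1] / local_nd[1],
                                               global_nd[2] / local_nd[2]};
    for (std::size_t gz = 0; gz < groups[2]; ++gz)
    for (std::size_t gy = 0; gy < groups[1]; ++gy)
    for (std::size_t gx = 0; gx < groups[0]; ++gx)
    for (std::size_t lz = 0; lz < local_nd[2]; ++lz)
    for (std::size_t ly = 0; ly < local_nd[1]; ++ly)
    for (std::size_t lx = 0; lx < local_nd[0]; ++lx) {
        // global ID = group ID * local size + local ID, per dimension
        const std::array<std::size_t, 3> id = {gx * local_nd[0] + lx,
                                               gy * local_nd[1] + ly,
                                               gz * local_nd[2] + lz};
        for (std::size_t d = 0; d < 3; ++d) {
            if (id[d] < r.min_id[d]) r.min_id[d] = id[d];
            if (id[d] > r.max_id[d]) r.max_id[d] = id[d];
        }
        ++r.work_items;  // one printf line per work-item in the original kernel
    }
    return r;
}
```

For global size {35, 35, 35} and local size {5, 5, 5}, this gives a range of 0 to 34 in all three dimensions over 42875 work-items, which is also how many lines the unconditional printf would try to emit.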
Upvotes: 1