Reputation: 862
I have the following OpenCL kernel that I want to debug. I have put some printf calls in it, but they are not useful: work items are scheduled in no fixed order, and the printed values are not always right. How can I make the work items in the kernel execute serially for debugging purposes?
The code is as follows:
__kernel
void SampleKernel( __global float4* gVtx, __global float4* gColor,
                   __global float4* gDst,
                   const int cNvtx,
                   const int4 cRes )
{
    printf("nVertex : %d ", cNvtx);
    for(int i = 0; i < 1; i += 4)
    {
        printf(" %f ", gVtx[0].x);
        printf(" %f ", gVtx[0].y);
        printf(" %f ", gVtx[0].z);
        printf(" %f ", gVtx[0].w);
    }
}
I have also tried putting calls to barrier(CLK_LOCAL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE); before and after the printf calls, but that did not help.
Can anybody suggest a way to serialize work item execution so that I can print and debug the kernel, or some better way to debug an OpenCL kernel? I am using an AMD RX 580 GPU.
Upvotes: 3
Views: 6075
Reputation: 1221
Some suggestions: use the global ID and group ID to control which work item prints, and include those IDs in the printed output. This significantly reduces the amount of printed output and gives you more control over which information you see.
Another tip: group multiple prints into a single call where possible. For instance, printing like this is not a good way to debug:
printf(" %f ", gVtx[0].x);
printf(" %f ", gVtx[0].y);
printf(" %f ", gVtx[0].z);
printf(" %f ", gVtx[0].w);
It is better to print them all at once, in a single printf call, so that they are not interleaved with prints from other work items.
With these two tips, debugging the kernel should be easier.
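As a rough sketch of how the two tips might be combined in the kernel from the question: only one selected work item prints, and it emits all values in one printf call tagged with its IDs. DEBUG_GID is a hypothetical name introduced here for illustration, and the sketch assumes a one-dimensional NDRange, so only dimension 0 is queried.

```opencl
// Sketch only: DEBUG_GID is a hypothetical choice of which work item to inspect.
#define DEBUG_GID 0

__kernel void SampleKernel(__global float4* gVtx, __global float4* gColor,
                           __global float4* gDst,
                           const int cNvtx,
                           const int4 cRes)
{
    const size_t gid = get_global_id(0);  // global ID in dimension 0
    const size_t grp = get_group_id(0);   // work-group ID in dimension 0

    // Only the selected work item prints, so the output stays short.
    if (gid == DEBUG_GID) {
        // A single printf call, tagged with the IDs, keeps the four
        // components together and makes the line easy to attribute.
        printf("gid=%u grp=%u nVertex=%d vtx0=(%f, %f, %f, %f)\n",
               (uint)gid, (uint)grp, cNvtx,
               gVtx[0].x, gVtx[0].y, gVtx[0].z, gVtx[0].w);
    }
}
```

To look at several work items, the condition could be widened (for example gid < 4); each line then still carries its own gid and grp, so the output can be sorted afterwards even if it arrives out of order.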
Upvotes: 5