Reputation: 3660
The following code does not use any callbacks or clWaitForEvents,
and yet it works perfectly. I thought clEnqueueNDRangeKernel was non-blocking, so why does this work?
void CL::executeApp1()
{
    cl_int status = 0;
    const int d1Size = 1024000;
    int* myInt = new int[d1Size];
    cl_mem mem1 = clCreateBuffer(context, 0, sizeof(int)*d1Size, NULL, &status);
    status = clEnqueueWriteBuffer(queue, mem1, CL_TRUE, 0, sizeof(int)*d1Size, myInt, 0, NULL, NULL);
    status = clSetKernelArg(kernel, 0, sizeof(cl_mem), &mem1);
    size_t global[] = {d1Size};
    cl_event execute;
    status = clEnqueueNDRangeKernel(queue, kernel, 1, NULL, global, NULL, 0, NULL, &execute);
    //clWaitForEvents(1, &execute);
    status = clEnqueueReadBuffer(queue, mem1, CL_FALSE, 0, sizeof(int)*d1Size, myInt, 0, NULL, NULL);
    string s = "";
    for(int i = 0; i < d1Size; i++)
    {
        s += to_string(myInt[i]);
        s += " ";
    }
    // +1 leaves room for the terminating '\0' that strcpy writes
    result = (char*)malloc(sizeof(char)*(s.length() + 1));
    strcpy(result, s.c_str());
}
Upvotes: 1
Views: 1180
Reputation: 8410
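That's true, clEnqueueNDRangeKernel is non-blocking.
However, you only have one queue, and it was probably not created with CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE, so it runs everything in the order it was enqueued.
First the write, then the kernel, and finally the read. If you don't use two queues for I/O and execution, the only call that needs to be blocking is the read (clEnqueueReadBuffer). Note that your read passes CL_FALSE, so strictly speaking myInt is not guaranteed to be filled when the loop starts; pass CL_TRUE there, or call clFinish(queue) before touching myInt.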
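To make the ordering argument concrete, here is a small toy model in plain C++. It is only a sketch and not OpenCL: InOrderQueue, PipelineResult and run_pipeline are hypothetical names made up for illustration, and the model defers all work until finish() is called, whereas a real runtime executes commands asynchronously on the device. It does show the two points above: commands on one in-order queue run in the order they were enqueued, and the host should only trust its data after a blocking call.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Toy model of an in-order command queue. This is NOT the OpenCL API:
// a real runtime executes commands asynchronously on the device, while
// this model simply defers all work until finish() is called.
class InOrderQueue {
public:
    // Non-blocking: records the command and returns immediately.
    void enqueue(std::function<void()> cmd) { pending_.push_back(std::move(cmd)); }

    // Blocking: returns only after every enqueued command has run, in FIFO order.
    void finish() {
        while (!pending_.empty()) {
            std::function<void()> cmd = std::move(pending_.front());
            pending_.pop_front();
            cmd();
        }
    }

private:
    std::deque<std::function<void()>> pending_;
};

struct PipelineResult {
    std::vector<int> before_finish;  // host data observed right after enqueueing
    std::vector<int> after_finish;   // host data observed after the blocking wait
};

// Mirrors the question's sequence: write, kernel, read, all on one queue.
PipelineResult run_pipeline(std::size_t n) {
    std::vector<int> host(n, 1);    // stands in for myInt
    std::vector<int> device(n, 0);  // stands in for the cl_mem buffer
    InOrderQueue queue;

    queue.enqueue([&] { device = host; });                 // "write buffer"
    queue.enqueue([&] { for (int& v : device) v *= 2; });  // "kernel": doubles each element
    queue.enqueue([&] { host = device; });                 // "read buffer"

    PipelineResult result;
    result.before_finish = host;  // nothing has run yet, so this is still the input
    queue.finish();               // the single synchronization point
    result.after_finish = host;   // write, kernel and read have completed in order
    return result;
}
```

In this model, no waits are needed between the three commands because the queue itself preserves their order; the only place the host has to block is before it looks at the result, which corresponds to a blocking clEnqueueReadBuffer or a clFinish in the real API.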
Upvotes: 2