user1873073

Reputation: 3660

Is clEnqueueNDRangeKernel ever synchronous?

The following code does not use any callbacks or clWaitForEvents and yet it works perfectly. But I thought clEnqueueNDRangeKernel was non-blocking.

void CL::executeApp1()
{
    cl_int status = 0;
    const int d1Size = 1024000;
    int* myInt = new int[d1Size];

    cl_mem mem1 = clCreateBuffer(context, 0, sizeof(int)*d1Size, NULL, &status);
    status = clEnqueueWriteBuffer(queue, mem1, CL_TRUE, 0, sizeof(int)*d1Size, myInt, 0, NULL, NULL);
    status = clSetKernelArg(kernel, 0, sizeof(cl_mem), &mem1);

    size_t global[] = {d1Size};
    cl_event execute;
    status = clEnqueueNDRangeKernel(queue, kernel, 1, NULL, global, NULL, 0, NULL, &execute);
    //clWaitForEvents(1, &execute);
    status = clEnqueueReadBuffer(queue, mem1, CL_FALSE, 0, sizeof(int)*d1Size, myInt, 0, NULL, NULL);

    string s = "";
    for(int i = 0; i < d1Size; i++)
    {
        s += to_string(myInt[i]);
        s += " ";
    }

    result = (char*)malloc(s.length() + 1); // +1 for the terminating '\0' that strcpy writes
    strcpy(result, s.c_str());
}

Upvotes: 1

Views: 1180

Answers (1)

DarkZeros

Reputation: 8410

That's true: clEnqueueNDRangeKernel is non-blocking.

However, you only have one queue, and it was probably not created with CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE, so it executes commands in the order they were enqueued.

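First the write, then the kernel, and finally the read. Unless you use separate queues for I/O and execution, the only call that needs to be blocking is clEnqueueReadBuffer(). Note that your code passes CL_FALSE to it, so the host loop can start reading myInt before the copy has finished; pass CL_TRUE there (or call clFinish(queue) first) to make the result reliable.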
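
To make the ordering argument concrete, here is a minimal sketch in plain C++ of how an in-order queue behaves. It is a toy model, not the OpenCL API: InOrderQueue, enqueue and run_pipeline are hypothetical names invented for this illustration, and a single worker thread stands in for the device. Each enqueue returns immediately, the commands still run one after another, and the host only has to wait on the last one.

```cpp
#include <condition_variable>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Toy stand-in for an in-order command queue (NOT the OpenCL API).
class InOrderQueue {
public:
    InOrderQueue() : worker_([this] { run(); }) {}

    ~InOrderQueue() {
        {
            std::lock_guard<std::mutex> lock(m_);
            done_ = true;
        }
        cv_.notify_one();
        worker_.join();  // drains any commands still queued
    }

    // Non-blocking: returns as soon as the command has been queued.
    std::future<void> enqueue(std::function<void()> cmd) {
        std::packaged_task<void()> task(std::move(cmd));
        std::future<void> finished = task.get_future();
        {
            std::lock_guard<std::mutex> lock(m_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
        return finished;
    }

private:
    void run() {
        for (;;) {
            std::packaged_task<void()> task;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return done_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // shutting down, nothing left
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // one worker, so commands run in submission order
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::packaged_task<void()>> tasks_;
    bool done_ = false;
    std::thread worker_;  // declared last so the other members exist first
};

// Mirrors the question's write -> kernel -> read sequence.
std::vector<int> run_pipeline(int n) {
    std::vector<int> device(n, 0);  // stands in for the cl_mem buffer
    std::vector<int> host(n, 0);    // stands in for myInt
    InOrderQueue queue;

    // "write": fill the device buffer; the host does not wait.
    queue.enqueue([&] { for (int i = 0; i < n; ++i) device[i] = i; });
    // "kernel": double every element; the host does not wait.
    queue.enqueue([&] { for (int& v : device) v *= 2; });
    // "read": copy the result back to the host.
    std::future<void> read = queue.enqueue([&] { host = device; });

    read.wait();  // the only blocking point, like a CL_TRUE read
    return host;
}
```

Calling run_pipeline(4) returns {0, 2, 4, 6}. If the read.wait() line were removed, the function could return host before the worker had copied into it, which is the same race the CL_FALSE read has in the question's code.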

Upvotes: 2
